Linux News
The world is talking about GNU/Linux and Free/Open Source Software

grep vs AWK vs Ruby, and a uniq disappointment

Posted by eldersnake on Mar 26, 2017 6:13 PM CST
The Linux Rain; By Bob Mesibov

In my data-cleaning work I often make up tallies of selected individual characters from big, UTF-8-encoded data files. What's the best way to do this? As shown below, I've tried grep/sort/uniq, AWK and Ruby, and AWK's the fastest. The trials also revealed an unexpected problem with the uniq program in GNU coreutils.
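The kind of tally described above can be sketched in a few lines of Python. This is only an illustration, not one of the three approaches (grep/sort/uniq, AWK, Ruby) tried in the story; the function name `tally_chars`, the sample lines and the set of characters of interest are all assumptions made for the example.

```python
from collections import Counter


def tally_chars(lines, wanted):
    """Count how often each character in `wanted` occurs across `lines`.

    `lines` is any iterable of already-decoded text (str) lines;
    `wanted` is a string or set holding the characters of interest.
    """
    wanted = set(wanted)
    counts = Counter()
    for line in lines:
        # Python strings are sequences of Unicode code points, so a
        # multi-byte UTF-8 character is counted once, not once per byte.
        counts.update(ch for ch in line if ch in wanted)
    return counts


# Hypothetical sample data standing in for a big UTF-8 data file.
sample = ["café, naïve", "déjà vu"]
result = tally_chars(sample, "éàï")

# Print "count character" pairs, one per line, in code-point order.
for ch, n in sorted(result.items()):
    print(n, ch)
```

For a real file, the same function could be fed an open file object, for example `open(path, encoding="utf-8")` with a hypothetical `path`, so that the data is read line by line rather than loaded into memory at once.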

Story Type: Editorial; Groups: Developer, GNU, Linux, Standards
