Saturday, March 21, 2009

Counting lines

Situation: a process is creating a large number of ASCII files, each with a large number of lines. You need to check how many lines have been created in total (across all files) at a given moment, so you can track the progress of the process. Fields in the files are fixed length, so every line has the same length.

Obvious solution: do something like wc -l *

Problem: wc has to read every byte of every file to count the newlines, so with many really big files the count takes a long time and uses a lot of resources. A real-time solution cannot be implemented this way.

Smarter solution: find out how many characters make up a line in a file (including the newline), divide the file size by this number and you get the number of lines in the file. To find the total number of lines in all files, just add up the number of lines for each file.

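Implementation: tail -1 [file] | wc -c gives the number of characters in a line, newline included (wc -l would only count the lines, which is always 1 here); ls -l gives the size of a file in bytes, which for ASCII is the same as the number of characters. So,
ls -l * | awk '{cnt=cnt+$5;}END{printf("%ld\n",cnt/[chars_per_line]);}'
gives the answer, with [chars_per_line] replaced by the number measured above.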
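
Here is a minimal sketch of the same idea wrapped in a shell function, so the line length does not have to be typed in by hand. The function name count_lines_by_size is made up for this example, and it assumes that every line in every file has the same length and that the first file passed in already contains at least one complete line. It measures the first line with head rather than the last one with tail, because the last line of a file that is still being written may be incomplete.

```shell
# Estimate the total number of lines in the given files from their sizes.
# Assumption: all lines in all files have the same byte length.
# count_lines_by_size is a hypothetical helper name, not a standard tool.
count_lines_by_size() {
    # Bytes in the first line of the first file, trailing newline included.
    # tr strips the padding some wc implementations put around the number.
    chars_per_line=$(head -n 1 "$1" | wc -c | tr -d ' ')

    # Field 5 of "ls -l" is the file size in bytes; sum the sizes and
    # divide by the line length. %d truncates a partially written line.
    ls -l "$@" | awk -v n="$chars_per_line" \
        '{ total += $5 } END { printf "%d\n", total / n }'
}
```

Called as count_lines_by_size *, it only reads directory information plus the first line of one file, so the cost does not grow with the size of the files.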

For starters...

For more than 12 years I have been dwelling in the land of software development. That is, I have been making a living out of it. Before that I did it as a hobby, and I continue to do so today.
What I like about it is that I keep finding new and interesting stuff to deal with. Not necessarily new technologies, although these come up at a faster and faster rate, but also things from day-to-day work.

This is what I want to talk about in this blog: things that can make your life easier as a software developer, things that make your work experience brighter and more entertaining. And, of course, things that can make you think it was not such a bad decision to get into this profession.

Therefore the articles here will not deal with a particular programming language and will not present the latest trends or technologies; there are all too many websites dealing with these.
This blog is just supposed to put a smile on your face by presenting nicer, faster and, maybe, smarter ways to perform your daily routine.
Another point: no hearsay! Only what comes from trial and experience. So posts will appear when something interesting comes along, not out of a need to fill up space.

Needless to say, many (most?) of the posts will be too basic for experts in a particular programming language or technology. But all together, I hope, they should make a nice addition to somebody's tools of the trade.

P.S.: On second thought, why not some tech news too?! After all, it's all about making you smile more... So, if (high) tech news is your thing, dig in!