Why you need a mutex to protect an int

This is a follow-up to my previous post about the need to protect access to even the simplest variables when working in a multi-threaded environment. In this post I would like to explain what’s going on under the hood and why you actually need some protection here.
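
To make the problem concrete, here is a minimal sketch, assuming C++11 threads. The function name count_with_mutex and its parameters are made up for illustration. Several threads increment one shared int, and a mutex guards every increment:

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Increments a shared int from several threads, taking a mutex around
// every increment, and returns the final value.
// Thread and iteration counts are arbitrary illustration parameters.
int count_with_mutex(int num_threads, int increments_per_thread)
{
    assert(num_threads > 0 && increments_per_thread >= 0);

    int counter = 0;          // the "simple" shared variable
    std::mutex counter_lock;  // protects counter

    auto worker = [&]() {
        for (int i = 0; i < increments_per_thread; ++i) {
            // ++counter is a read-modify-write sequence, not one
            // indivisible step. Without the lock, two threads can read
            // the same old value and one of the updates is lost.
            std::lock_guard<std::mutex> guard(counter_lock);
            ++counter;
        }
    };

    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t)
        threads.emplace_back(worker);
    for (auto& th : threads)
        th.join();

    return counter;
}
```

With the lock in place, count_with_mutex(4, 10000) returns 40000. If you remove the lock_guard line, the program has a data race and the result is no longer predictable. With gcc you may need to build with -pthread.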

Read the rest of this entry »

Cautionary tale about using threads and fork()

I ran into an interesting problem. I found a process stuck on a mutex. This sounds like a common deadlock, but it wasn’t that obvious: the thread that was stuck was the only thread in the process.

Read the rest of this entry »

Spam

Folks, there seem to be some technical issues with the “Notify me when new comments arrive” feature. The feature allows you to post a comment on alexonlinux.com and receive email notifications when someone responds.

I got a few complaints about spam comments getting into people’s mailboxes. And now it sends multiple emails too. So I decided to disable this feature. Please accept my apologies for the inconvenience, and let me know if you see other problems.

Best regards,
Alex.

Bloom filters

A Bloom filter is a data structure that represents a set of elements. Unlike regular data structures, it cannot contain data that is associated with a certain key. Nor can it contain the keys themselves. The only type of information it can contain is whether a certain key belongs to the set or not.

You must be wondering what it is useful for. Here’s a typical scenario for using a Bloom filter. Let’s say you have a large data structure and you often have to check whether a particular member is in it. For example, let’s say you have a large binary tree and you often query the tree to see if it contains some element.
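
Here is a minimal sketch of such a filter for string keys. The class name BloomFilter, the method names, the "#salt" suffix and the double-hashing scheme are my assumptions for illustration, not a reference implementation. Note that a Bloom filter can answer "definitely not in the set" or "probably in the set": false positives are possible, false negatives are not.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Minimal Bloom filter sketch for strings.
class BloomFilter {
public:
    BloomFilter(std::size_t num_bits, std::size_t num_hashes)
        : bits_(num_bits, false), num_hashes_(num_hashes)
    {
        assert(num_bits > 0 && num_hashes > 0);
    }

    // Record that key belongs to the set by setting num_hashes_ bits.
    void add(const std::string& key)
    {
        for (std::size_t i = 0; i < num_hashes_; ++i)
            bits_[position(key, i)] = true;
    }

    // false: key was definitely never added.
    // true:  key was probably added (false positives are possible).
    bool might_contain(const std::string& key) const
    {
        for (std::size_t i = 0; i < num_hashes_; ++i)
            if (!bits_[position(key, i)])
                return false;
        return true;
    }

private:
    // Derive the i-th bit position from two base hashes (double hashing).
    std::size_t position(const std::string& key, std::size_t i) const
    {
        std::size_t h1 = std::hash<std::string>{}(key);
        std::size_t h2 = std::hash<std::string>{}(key + "#salt") | 1;
        return (h1 + i * h2) % bits_.size();
    }

    std::vector<bool> bits_;   // the only state: no keys, no values
    std::size_t num_hashes_;
};
```

In the binary tree scenario, you would call add() for every element inserted into the tree and check might_contain() before each lookup. When it returns false you can skip the expensive tree query altogether.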

Read the rest of this entry »

printf() vs stream IO in C++

Before joining Dell I mostly worked in the kernel, writing in the C programming language. At Dell I still work mostly on low-level stuff, but this time it is user mode, so I am not tied to C anymore. We’re writing in C++ and I am learning C++. One of the less appealing things for me in C++ was streaming support and the way its input/output is implemented. In particular, I got used to the printf() family of functions, and leaving those in favor of streams and cout was tough. What really strikes me is the fact that no C++ book explains this stuff. All C++ books just tell you: so, my dear, this is how this stuff is done in C++. It took me some time to realize how much more convenient and powerful C++ style input/output is than the printf() family of functions. Here’s why.
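
One advantage that is often mentioned (the full entry may give other reasons) is that streams can be taught to print your own types, while printf() format specifiers only cover built-in ones. A minimal sketch, where the Point type and the describe() helper are hypothetical:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical user-defined type, for illustration only.
struct Point {
    int x;
    int y;
};

// Teach streams how to print a Point. printf() has no equivalent hook:
// its format specifiers cover built-in types only.
std::ostream& operator<<(std::ostream& os, const Point& p)
{
    return os << "(" << p.x << ", " << p.y << ")";
}

// Formats a labelled point into a string through the same operator<<
// that std::cout would use.
std::string describe(const std::string& label, const Point& p)
{
    std::ostringstream out;
    out << label << " = " << p;
    assert(out.good());  // formatting into a string stream should not fail
    return out.str();
}
```

The same operator<< works unchanged with std::cout, file streams and string streams, and the compiler checks the argument types instead of trusting a format string.
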
Read the rest of this entry »

gcc macro language extensions

One of the great things about gcc, and in particular its C/C++ preprocessor, is the various extensions that it has. In this post I would like to briefly describe three of them. The first allows you to turn a C/C++ token into a string. Here a token is anything that you can pass as an argument to a macro. The second allows you to concatenate two tokens to create a new expression. The last one allows C/C++ macros with a variable number of arguments.
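
The three features can be sketched as follows. The macro and function names are invented for this example. Note that the # and ## operators used here are also part of standard C and C++, and __VA_ARGS__ is standard since C99 and C++11; gcc adds its own variants on top, such as named variadic parameters, which are not shown.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// 1. Stringification: #x turns the macro argument into a string literal.
#define TO_STRING(x) #x

// 2. Token concatenation: a##b glues two tokens into a single token.
#define CONCAT(a, b) a##b

// 3. Variable number of arguments: __VA_ARGS__ stands for whatever was
//    passed in place of the "...".
#define FORMAT_INTO(buf, ...) std::snprintf(buf, sizeof(buf), __VA_ARGS__)

// Exercises all three macros and returns the number of characters written.
int macro_demo(char (&out)[64])
{
    int CONCAT(my, _value) = 40;   // declares a variable named my_value
    my_value += 2;

    // TO_STRING(my_value) expands to the string literal "my_value".
    int written = FORMAT_INTO(out, "%s = %d", TO_STRING(my_value), my_value);
    assert(written > 0 &&
           std::strlen(out) == static_cast<std::size_t>(written));
    return written;
}
```

After calling macro_demo() on a 64-byte buffer, the buffer holds the text "my_value = 42".
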
Read the rest of this entry »

UML cheatsheet

Every once in a while, I have to draw a UML diagram. I rarely do serious designs with UML; however, sometimes I do need to depict some piece of code in a diagram, and UML seems to be the best notation around.

Unfortunately, various sources of information on UML tend to over-complicate things. I am not a software architect and drawing UML diagrams is not my job. So my UML skills are poor by definition. Moreover, I am happy with this situation and don’t see it changing in the future (even if I get promoted ;-) ).

So from time to time I need a simple UML reference card. A simple search finds references like this one, which are excellent if you are serious about UML, and I am not.

Eventually, I decided to write a short UML class diagram reference card for myself. I hope you will enjoy it as well.
Read the rest of this entry »

Making writes durable – is your data on disk?

Here is an interesting article written by Evan Jones. The article explains how you can be sure that your data is actually on disk.

In case you’re wondering, when write(), fwrite() or any other library call that writes data to disk reports success, you are not guaranteed that the data is actually on the disk. In fact, in Linux, write() reports success once the data is in the cache, marked dirty. Later, a special kernel thread kicks in and makes sure that the data gets to disk.

Depending on circumstances, it may take some time until the writer kernel thread finishes writing. Anyway, in his post Evan talks about how to make sure that the data is actually stable on disk.
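
As a rough illustration of the difference, here is a minimal sketch that uses the POSIX fsync() call to wait until the kernel has flushed the file. The function name write_durably is my own, and this is not necessarily the exact recipe from Evan’s article. It also skips some details: it does not retry on EINTR, and a newly created file may need an fsync() on its parent directory as well.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Writes len bytes to path and returns true only after fsync() has
// reported success, i.e. after the kernel was asked to flush the file's
// dirty pages to the storage device.
bool write_durably(const char* path, const char* data, std::size_t len)
{
    assert(path != nullptr && data != nullptr);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return false;

    std::size_t done = 0;
    while (done < len) {
        // write() may accept fewer bytes than requested. Success here
        // only means the data reached the kernel's cache.
        ssize_t n = write(fd, data + done, len - done);
        if (n < 0) {
            close(fd);
            return false;
        }
        done += static_cast<std::size_t>(n);
    }

    bool ok = (fsync(fd) == 0);   // block until the data is flushed
    return (close(fd) == 0) && ok;
}
```

A call such as write_durably("/tmp/example.txt", "hello\n", 6) returns true only if both the write and the flush succeeded. The price is latency: fsync() can take much longer than the write() itself.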

Models for multithreaded applications

As you know, I changed a couple of workplaces during my career. Long story short, one interesting thing that I noticed is that different companies use different models for multi-threaded programs (mostly for large embedded systems).

Read the rest of this entry »

Python for bash replacement

When I started learning Python, I was looking for a programming language that would replace BASH, AWK and SED. I am a C/C++ programmer and as such I would do better to invest my time in studying C and C++. Instead, every time I needed some complex script I opened up a book on BASH and refreshed my knowledge. And since bumping into the boundaries of what BASH can do is relatively easy, I always opened an awk/sed book a few minutes later.

Actually, this is quite common. Once in a while I see my colleagues, just like myself, open up a book on BASH. The problem is that because we don’t actively program in BASH, the knowledge that we gain from this experience wears off over time. So the next time we approach a script, we have to study the BASH stuff all over again. And again, this is not only BASH I am talking about, but also AWK and SED.

It is an utterly broken state of affairs and I wish there was a solution. Unfortunately there is no solution yet. The good thing is that with some effort a solution may arise. I am talking about the Python programming language.

Read the rest of this entry »