Tag Archives: CSV

Data Volumes

Despite what they say, size does matter. Successful data management is all about finding the proper tools and formats for dealing with your data. There is no one-size-fits-all solution. And the very first question you should be asking yourself is: …

Posted in Data Management

Zero vs. Missing

On the left we have zero, our integer measure of nothingness. On the right we have the missing value, aka N/A, aka NA, our signal that the value of a datapoint is unknown. Everyone who deals with data has to deal …
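To make the distinction concrete, here is a minimal Python sketch. The rainfall readings are made up for illustration, and using `None` to stand in for NA is an assumption of this example, not something the post prescribes.

```python
from statistics import mean

# Hypothetical daily rainfall readings (mm); None marks a missing reading.
readings = [2.5, 0.0, None, 4.0]

# Zero is a real measurement and stays in; the missing value is left out.
observed = [r for r in readings if r is not None]
mean_observed = mean(observed)

# Treating the missing value as zero pulls the average down.
as_zero = [0.0 if r is None else r for r in readings]
mean_as_zero = mean(as_zero)

print(mean_observed, mean_as_zero)
```

The two averages differ because a zero counts as an observation while a missing value does not.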

Posted in Data Management

Ten UNIX commands every data manager should know

Working with data from varied sources can be frustrating: some data will be in CSV format, some in XML, some available as HTML pages, and other data in relational databases or MS Excel spreadsheets. This post will cover the UNIX …

Posted in Data Management

Liberating data from Microsoft Access “.mdb” files

Many people given the task of managing data reach for the tools available to them on their office computer. Typically this will include the Microsoft suite of products, including the Access database. Not surprisingly, Microsoft Access Database files, “.mdb” files, …

Posted in Data Management

When is a number not a number?

Have you ever asked yourself whether your telephone number is really a number? It’s got numbers in it but does it measure anything? How about your credit card number? PO Box? Social Security Number? What would happen if you subtracted …
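As a small illustration of why such identifiers are safer stored as text, the Python sketch below converts a made-up PO Box identifier to an integer and back. The value `"00742"` is hypothetical.

```python
# Hypothetical identifier; the leading zeros are part of the value.
po_box = "00742"

as_number = int(po_box)       # numeric conversion drops the leading zeros
round_trip = str(as_number)   # converting back does not restore them

identifiers_match = (round_trip == po_box)
print(as_number, round_trip, identifiers_match)
```

Once the identifier has been treated as a number, the original string cannot be recovered from it.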

Posted in Data Management