Sunday, May 06, 2012

New GIT Time Extractor Gem

I am happy to announce that the git_time_extractor project, released publicly on github.com in February, is now officially available as a RubyGem!

git_time_extractor is a small command-line tool that produces a CSV time log for each developer/contributor to a project tracked in GIT. It uses the commit timestamps and three basic reasonable assumptions, approved by the developer's accountant.

It other words, computes the estimated time spent by developers working on code within a GIT repository. This is particularly useful for verifying developer timesheets and for tax purposes.

The benefits are

  • Save money on taxes by producing documents required by accountant to properly apply for certain tax credits
  • Track development time without time sheets
  • Compute time records based on the timestamps of each code commit by each developer
  • Compare these results with other time-sheets or metrics to measure the effectiveness of your team members

To do this, git_time_extractor makes a few assumptions

  • A series of commits within 3 hours are part of the same working session
  • A first commit is considered to represent 30 minutes of working time
  • Frequent commits increases the accuracy of the estimated time

To get started

Installing is easy from your computer with Ruby installed.  From the command line, run:
gem install git_time_extractor
Then go one of your GIT project directories, and run:
git_time_extractor > output_time.csv
 

Then open the output_time.csv file with your favorite spreadsheet editor, be it Microsoft Excel, OpenOffice Calc, Google Apps, or something else.

Upcoming talk at the AtlRug Meeting

I am giving a short talk on the new git_time_extractor gem at the May 9, 2012, Atlanta Ruby Users Group meeting at 6:30 pm. If you are in the metro Atlanta area, please join us. It's a good group of developers. There is usually free pizza and good company as well.

Wednesday, April 18, 2012

mod_deflate: Dramatic website speed increase with Apache compression on Ubuntu Linux

It's easy to dramatically improve the loading speed of your website by enabling compression support in your Apache web server. This works without needing to make any changes to your backend or database code no matter what programming platform your development team uses.

I recently added this change to a Ruby on Rails-based web application running on a Ubuntu Linux dedicated virtual server. That particular page observed a 72% decrease in bandwidth required to deliver the homepage.

It has been my experience with our internal and with client RoR sites that mod_deflate tends to provide a 62% to 72% savings on bandwidth required to deliver each page. This is significant for visitors on slower connections or mobile phones. It is well worth enabling for most applications.

Three Easy Steps

Step 1: Enable mod_deflate in your Apache2 installation

sudo a2enmod deflate

Step 2: Edit /etc/apache2/httpd.conf to configure mod_deflate to compress certain file types

<IfModule mod_deflate.c>
# compress text, html, javascript, css, xml:
AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/x-javascript
AddOutputFilterByType DEFLATE image/x-icon
</IfModule>

Step 3: Restart Apache to enable the changes 

sudo apachectl graceful

Test if it Worked

Once you have made the changes, test if the deflate is working properly by browsing through the website with all of your browsers. Then confirm that the connection is running deflate using a third party tool. One such is http://www.whatsmyip.org/http-compression-test/.

References




Tuesday, April 17, 2012

Big data a big deal for SQL Server 2012, users say

TechTarget just published an article by Alan Earls in which both Sanjay Bhatia, a fellow founder of Atlanta-based database company (and ATDC graduate), and myself were quoted.

Big data a big deal for SQL Server 2012, users say:

For Frank Rietta, the fact that Microsoft has been willing to work with an established open source project is the best part of SQL Server 2012.
Rietta is talking about the Apache Hadoop integration with SQL Server 2012. He is a software developer and president of Atlanta-based Rietta Inc., one of the member companies at the Advanced Technology Development Center (ATDC), which runs the incubator program at the Georgia Institute of Technology.
“Big data is becoming increasingly important even for small startups,” he said. And Microsoft’s partnering with established open source project is “refreshing,” he said.
Please check out Alan's entire article.  The database server space is getting very interesting for small businesses and startups, who increasingly have access to technology that was previously only accessible to the largest organizations.

Thursday, April 12, 2012

SQL Saturday in Atlanta; SQL Converter to support PostgreSQL

I am looking forward to being at the SQL Saturday event at Georgia State University in Atlanta this Saturday!  http://sqlsaturday.com/111/eventhome.aspx.  It's exciting to see the traction that the SQL Converter Expert Conversion service is getting.  We're going to be adding some really great stuff soon, including PostgreSQL support.

Thursday, April 05, 2012

What is Protected Personally Identifiable Information? Do I really have to hash users' passwords?


The Short Answer
The legal answer depends on which Federal, State, and local laws apply to your company.  And I am not a lawyer.  However, for companys whose nexus is in Georgia, where my company is located, the Georgia General Assembly has given some guidance in the data breach law.

And yes, you really do have to hash your users' passwords or you risk having to do a full blown Data Breach Notification if the user's table is ever compromised!

The Details

The exact definition of personal information varies among states.

However, since my company is located in the State of Georgia, and so are many of our clients, I will use the Georgia State Data Breach Notification Law as an example.

OCGA 10-1-911 (Official Georgia State Law) defines it as:

“Personal information” means an individual's first name or first initial and last name in combination with any one or more of the following data elements, when either the name or the data elements are not encrypted or redacted:

  1. Social security number;
  2. Driver's license number or state identification card number;
  3. Account number, credit card number, or debit card number, if circumstances exist wherein such a number could be used without additional identifying information, access codes, or passwords;
  4. Account passwords or personal identification numbers or other access codes; or

The term “personal information” does not include publicly available information that is lawfully made available to the general public from federal, state, or local government records.

In general, the Georgia General Assembly has expressed concern over the threat of identity theft.

My understanding is that if you do not hash the passwords in your database and it is leaked then you have to do a full formal data breach notification. People tend to use the same passwords are multiple places. Secure hash algorithms with salting are your friends here.

Other Potentially Applicable Laws

In general, security standards are either required for all entities that handle certain information (law mandates) or by contractual agreements with a private party or a government agency.

U.S. Laws

  • State Data Breach Laws
  • Health Insurance Portability and Accountability Act
  • Gramm-Leach-Bliley Act
  • Sarbanes–Oxley Act of 2002 (SOX)
  • Family Educational Rights and Privacy Act (FERPA) 
 

Contractual Agreements


  • Payment Card Industry Digital Security Standards (PCI-DSS)
  • Federal Data Security Standards
  • NIST SP 800-53, Recommended Security Controls for Federal Information Systems

Monday, April 02, 2012

New Lines in Cell Data: The SQL Converter Expert Conversion Service Can Handle It!

The SQL Converter Team just had a client whose spreadsheet contained lots of new line characters in individual cell data.  Microsoft Excel 2010 on Windows was removing these new lines when saving the data as CSV or even when attempting to run it through the classic SQL Converter for Excel add-in.

Despite the native Excel limitation, the SQL Converter Expert Conversion Service was able to handle the data and convert it to MySQL 5.  We updated our Ruby-based conversion tool to directly read cell data from an Excel 2010 document and created a SQL dump file that preserved the new line characters within the cell data.

Do you have spreadsheet data that needs to be converted into SQL?  Just send us your spreadsheet and let our professional data conversion specialists do the conversion to SQL for you. Nothing could be easier.  Learn more at:

http://www.sqlconverter.com/expert.html


Saturday, March 03, 2012

Best Data Types for Currency/Money in MySQL?

So you want to know what the best data type is for storing currency/money values in MySQL?

The Short Answer

When handling money in MySQL, use DECIMAL(13,2) if you know the precision of your money values or use DOUBLE if you just want a quick good-enough approximate value.

The Details

On some database management systems (DBMS) there is a data type specially set aside for handling money.  However, MySQL doesn't have one so the choice of best representation is left to the programmer.

Since about 1999, using MySQL 3, I have generally used the DOUBLE data type to represent money values as it stores decimals at reasonably high precision. The values stored are approximations, but the number of decimals treated accurately is enough for normal money values.  The ROUND function is helpful when doing aggregation queries, such as:

SELECT
  ROUND(SUM(amount), 2) AS total_amount
FROM orders
WHERE created_at > '2012-01-01'

However, since MySQL 5 there are some better data types for storing exact money values.  From the MySQL manual:
"The DECIMAL and NUMERIC types store exact numeric data values. These types are used when it is important to preserve exact precision, for example with monetary data. In MySQL, NUMERIC is implemented as DECIMAL, so the following remarks about DECIMAL apply equally to NUMERIC."
The DECIMAL data type requires you to specify the number of digits before the decimal point and the number of digits after the decimal point.

So if your application needs to handle money values up to a trillion dollars (or euros or pounds), then this should work:
DECIMAL(13, 2)
However, considering the rate that the central banks are printing money these days, we might all soon need to be processing larger values for currency.

I hope this helps.

References

 
One Last Thing

One of the services my company offers is database conversion solutions.  Are you looking to convert spreadsheet data to a MySQL database?  Check out SQL Converter's new Expert Conversion Service.  The classic Excel macro is still available as well.