Tuesday, December 3, 2013

sed notes

I am a sed newb. Today I encountered usages of it that I want to note for future reference.

sed 's/^M//g' SHOP-684.sql > SHOP-684-noM.sql

Note: The ^M above is a single control character, not a caret followed by an M. To type it, hold the control key and press v, then m.

This removes the ctrl-M's (carriage returns) that litter Windows-saved files in a unix env. On many unix systems you can use dos2unix, but on a Mac that command (and its counterpart unix2dos) is not installed by default.
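
If typing the control character is awkward, one alternative is to let printf generate the carriage return and pass it to sed. This is only a sketch: the function name strip_cr is made up for illustration, and it assumes the file should lose every carriage return, wherever it appears.

```shell
# strip_cr: hypothetical helper that deletes every carriage return (^M) from a file.
# printf produces the CR, so nothing has to be typed with ctrl-v ctrl-m.
strip_cr() {
    cr=$(printf '\r')
    sed "s/${cr}//g" "$1"
}

# Example: strip_cr SHOP-684.sql > SHOP-684-noM.sql
```

Because the carriage return reaches sed as a literal character, this does not depend on sed understanding \r, which the Mac version may not.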

To replace ^M with newlines on a Mac, I had to type a backslash and then hit Enter inside the replacement, so the command continues on a second line:

$ sed 's/^M/\
/g' SHOP-684.sql > SHOP-684-noM.sql
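
For a file whose lines end in a bare carriage return, tr can probably do the same replacement without any literal control characters or line continuations. A minimal sketch, with cr_to_newline as a made-up name:

```shell
# cr_to_newline: hypothetical helper that turns every carriage return into a newline.
# tr interprets the \r and \n escapes itself, so no control characters are typed.
cr_to_newline() {
    tr '\r' '\n' < "$1"
}

# Example: cr_to_newline SHOP-684.sql > SHOP-684-noM.sql
```

Note that on a file with Windows line endings (carriage return plus newline) this would leave an empty line after every row, so there it is better to delete the carriage returns than to translate them.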

I have a giant file (almost 1 million lines) that I need to edit to remove tabs and other detritus to convert it into a usable SQL file. Opening it in IntelliJ or vi takes a while, so it is great to be able to do this with sed instead. A big plus: it returns almost immediately.

To remove tabs, since the version of sed on a Mac does not support \t on the left side of "s///", I used the control character for it, which happens to be ^I. Hold the control key and press v, then i, to insert a literal tab. On screen it shows up as blank space between the slashes:

sed "s/      //g" SHOP-684-noM.sql > SHOP-684-noT.sql
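
Both cleanups can likely be done in one pass, which avoids the intermediate file. The sketch below assumes that every carriage return and every tab should simply be deleted; the name clean_sql_dump is invented for the example.

```shell
# clean_sql_dump: hypothetical helper that removes carriage returns and tabs in one sed run.
# Both characters come from printf, so neither has to be typed as a control character.
clean_sql_dump() {
    cr=$(printf '\r')
    tab=$(printf '\t')
    sed -e "s/${cr}//g" -e "s/${tab}//g" "$1"
}

# Example: clean_sql_dump SHOP-684.sql > SHOP-684-clean.sql
```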


Wednesday, July 24, 2013

IFTTT test post!

I created a simple IFTTT recipe to send me an SMS whenever there's a new post on this blog. Testing it out now.

This links (for me) to my personal IFTTT recipes: https://ifttt.com/myrecipes/personal

IFTTT stands for If This Then That.

Monday, April 22, 2013

How to find out if a domain name is a CNAME in Unix/Linux

$ host -t cname qadbrw01 
qadbrw01.cluster is an alias for va-qa-dbrw101.cluster.
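
To use this in a script, the alias target can be pulled out of that output line. A rough sketch that assumes host prints the "is an alias for" wording shown above; cname_target is a made-up name.

```shell
# cname_target: hypothetical helper that reads `host -t cname` output on stdin
# and prints the alias target, or nothing when there is no alias line.
cname_target() {
    sed -n 's/^.* is an alias for \(.*\)\.$/\1/p'
}

# Example (needs a name your resolver knows): host -t cname qadbrw01 | cname_target
```

An empty result then means the name is not a CNAME, at least as far as the resolver being queried is concerned.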

Thursday, May 17, 2012

MySQL function GROUP_CONCAT and CAST

I have a table of emails that may have multiple entries for a contact and I wanted to join all the emails together before joining the emails table to the contacts in my select. My query looked something like this:

SELECT mc.contactid, ce.emailAddress as emailAddress
FROM merchant_contact mc
JOIN (
SELECT contactid, GROUP_CONCAT(DISTINCT emailaddress SEPARATOR ',') as emailAddress
FROM contact_email GROUP BY contactid
) ce ON ce.contactid = mc.contactid

It ran fine in DbVisualizer, but when it ran as part of my Java app it returned values like

[B@2d7f2fae
[B@79135fd7
[B@66f95a5a


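These looked something like pointer addresses, not the email values I was expecting. (They are what Java prints for a byte array: "[B" is the JVM's name for byte[], followed by a hash code.)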

After trying a few things that did not work, including adding group_concat as an SQL function to my configuration (recommended here), I tried changing this:

query.addScalar("emailAddress", Hibernate.STRING);

to


query.addScalar("emailAddress");

to let Hibernate try to determine the type itself. Although this didn't fix it, I did get more information to work with, because it complained "No Dialect mapping for JDBC type: -4". JDBC type -4 is LONGVARBINARY, so the GROUP_CONCAT result was coming back as binary data rather than as a string. Searching for the error got me to this post on CodeRanch (which I find helpful from time to time), where the poster fixed his problem by CASTing from an NVARCHAR to a VARCHAR. I tried a variation on this and it fixed my issue, so here is what I ended up doing:

SELECT mc.contactid, ce.emailAddress as emailAddress
FROM merchant_contact mc
JOIN (
SELECT contactid, CAST(GROUP_CONCAT(DISTINCT emailaddress SEPARATOR ',') AS char) as emailAddress
FROM contact_email GROUP BY contactid
) ce ON ce.contactid = mc.contactid

Posted here in case this helps someone going forward.

Monday, November 14, 2011

How to add up all numbers, one per line in a file

awk '{sum+=$1} END {print sum}' /tmp/foo

(From Mike Masters, of course)
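
A small variation, wrapped in a function so it can be reused. This is a sketch rather than Mike's version: the name sum_column is invented, and the "+ 0" is there so that an empty file prints 0 instead of a blank line.

```shell
# sum_column: hypothetical helper that adds up the first field of every line in a file.
# "sum + 0" forces a numeric result, so an empty file yields 0 rather than an empty line.
sum_column() {
    awk '{ sum += $1 } END { print sum + 0 }' "$1"
}

# Example: sum_column /tmp/foo
```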

Friday, November 11, 2011

How to output the first line of each file in a directory


$ head -n 1 *


Sample output:
$ head -n 1 *
==> Desktop <==

==> Development <==

==> Documents <==

==> Downloads <==

==> Dropbox <==

==> Environment <==

==> Library <==

==> Movies <==

==> Music <==

==> Pictures <==

==> Public <==

==> Sites <==

==> bin <==

==> current.html <==
Current IP CheckCurrent IP Address: 63.119.11.19

==> databases <==

==> my.cnf <==
[client]

(That's the directory structure of my home dir on my work machine.)
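
Since the directories in that output contribute only empty headers, a loop that skips anything that is not a regular file may be more useful. A minimal sketch; first_lines is a made-up name, and the "name: line" output format is just one choice.

```shell
# first_lines: hypothetical helper that prints "name: first line" for each
# regular file in the given directory, skipping subdirectories entirely.
first_lines() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        printf '%s: %s\n' "${f##*/}" "$(head -n 1 "$f")"
    done
}

# Example: first_lines .
```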
 

Wednesday, November 2, 2011

Notes on GROUP BY in MySQL

Here is a query I wanted to run, but I was concerned that the value of fat.rowsprocessed would not come from the same fat row as min(fat.processeddate).

select m.domain, ma.merchantacctid, ma.createddate, 
fat.rowsprocessed, min(fat.processeddate)
from merchant_account ma
join merchant m on m.merchantacctid = ma.merchantacctid
join ftp_audit_trail fat on fat.merchantacctid = 
    ma.merchantacctid
where fat.processeddate > ma.createddate 
and fat.rowsprocessed > 0
and ma.createddate > '2009-12-31'
group by fat.merchantacctid
order by domain;

My buddy Spencer pointed out that in standard SQL, if you use an aggregate function, then you have to include all the other fields you are selecting in the GROUP BY. It turns out that MySQL has an extension to GROUP BY, and to HAVING, that lets you group by a single field while still selecting the others:

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.
(From the MySQL manual, section 11.15.3, "GROUP BY and HAVING with Hidden Columns".)

I was afraid that the db would pick an arbitrary value of rowsprocessed, one that did not come from the same row as min(processeddate), so I also tried this version:

select m.domain, ma.merchantacctid, ma.createddate, 
fat.rowsprocessed, fat.processeddate
from merchant_account ma
join merchant m on m.merchantacctid = ma.merchantacctid
join ftp_audit_trail fat on fat.merchantacctid = 
    ma.merchantacctid
where fat.processeddate > ma.createddate 
and fat.rowsprocessed > 0
and ma.createddate > '2009-12-31'
group by fat.merchantacctid
having min(fat.processeddate)
order by domain;

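HAVING is what I wanted to use. It's also not legal in standard SQL, but the same MySQL extension enables this.

I ran both, exported the CSVs, and diffed them, and they gave identical results. But I suspect that was luck in both cases. Because the HAVING clause has no comparison, it only keeps groups whose min(fat.processeddate) evaluates as true, which should be every group here; it does not tie fat.rowsprocessed or fat.processeddate to the earliest row. Neither query is dependably non-arbitrary. A dependable version would join ftp_audit_trail to a subquery that selects merchantacctid and min(processeddate) for each merchant.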