I Lessen Data: 2011

Wednesday, August 17, 2011

Double-UTF

I've stored a bit of a snapshot of all the music I've liked by picking one song per album and putting them into a seasonal playlist in iTunes. So I've got "2006 Spring", for example, which has about a dozen songs I liked to listen to in spring 2006.

But I've just stored these in iTunes. Not only does that mean they're locked within the Apple Empire, they're also vulnerable to me losing my hard drive. So I wanted to get them into real text files. Luckily, iTunes lets you export playlists. Unluckily, the export is in some bizarre janky format, when I really just want to extract the artist, title, and album for each song. Simple Python script to the rescue.

Ah, but even after deleting some of the crud, I was left with a file in a mash of file formats! See, I had pulled out artist, title, and album, then concatenated them with commas, then written that to a file. But I hadn't paid attention to encodings, so I had some UTF-16 characters, then some UTF-8 commas, then more UTF-16 characters. But Python has an easy answer: just read in the one file as UTF-16, specify that your output file is UTF-8, and within your script deal with strings and don't worry about encodings.

Tim Bray explains UTF-8, UTF-16, and UTF-32 clearly; this is something I probably should have thoroughly understood a while ago.
Evan Jones has a nice overview of how to use unicode in Python.

And here's my script:

#!/usr/bin/env python
import codecs

for filename in open("filenames.txt"):  # next time I'll learn
                                        # syntax for "for filename
                                        # in current directory"
    filename = filename.strip()
    outfilename = "output/" + filename.replace(" ", "_")
    # write UTF-8 out...
    outfile = codecs.open(outfilename, "w", "utf-8")
    # ...and read UTF-16 in; in between, everything is just strings
    for bigline in codecs.open(filename, "r", "utf-16"):
        lines = bigline.split("\r")
        for line in lines:
            parts = line.split("\t")
            if len(parts) < 4:
                continue
            song = parts[0]
            artist = parts[1]
            album = parts[3]
            linetowrite = "%s, %s, %s\n" % (artist, song, album)
            outfile.write(linetowrite)
    outfile.close()
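
For what it's worth, here's a rough sketch of how the same thing might look in Python 3, where the built-in open() takes an encoding argument and glob can stand in for filenames.txt. The function names, the "*.txt" pattern, and the default "output" directory are my own assumptions for illustration, not anything iTunes dictates.

```python
import glob
import os


def convert_playlist(in_path, out_path):
    """Read a UTF-16 playlist export, write "artist, song, album" lines as UTF-8."""
    # newline="" leaves the bare "\r" line endings alone so we can split on them.
    with open(in_path, "r", encoding="utf-16", newline="") as infile, \
         open(out_path, "w", encoding="utf-8") as outfile:
        for line in infile.read().split("\r"):
            parts = line.strip("\n").split("\t")
            if len(parts) < 4:
                continue  # blank or malformed line
            song, artist, album = parts[0], parts[1], parts[3]
            outfile.write("%s, %s, %s\n" % (artist, song, album))


def convert_all(in_dir=".", out_dir="output"):
    """Convert every .txt file in in_dir (assumed extension) into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    for in_path in glob.glob(os.path.join(in_dir, "*.txt")):
        name = os.path.basename(in_path).replace(" ", "_")
        convert_playlist(in_path, os.path.join(out_dir, name))
```

Calling convert_all() from the directory that holds the exported playlists should do roughly what the script above does, minus the filenames.txt step.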

Wednesday, July 6, 2011

vim: search and replace like breathing

One great thing about vim is how you can search/replace with just a few keystrokes. Being able to search and replace at light speed makes you feel very wizardly. Here are some commands I've been using frequently to do so.

(Note that these are all regexes, and furthermore vim regexes, which differ slightly from other regex implementations, and I won't get into that.)

/foo(enter)
(just type this, in normal mode, and you're instantly at the next instance of "foo". Then hit "n" to go to the next instance of "foo".)

/foo\c(enter)
same as the above, but case insensitive. (I guess you could do :set ic first instead of the \c there, but I don't like doing things that leave state lying around if I don't have to)

:s/foo/bar/g(enter)
Replace all "foo"s with "bar" on this line only.

:%s/foo/bar/g(enter)
Replace all "foo"s with "bar" in the entire file.

:%s/foo/bar/gc(enter)
Replace all "foo"s with "bar" in the whole file, but ask for confirmation at each one. I like this one a lot.

Wednesday, June 29, 2011

Tracking users via cookies

Disclaimer: this is a "work in progress" or a "I don't know if this is good" post.

I have a simple web app called Sea Salt that serves a javascript game. I'd like to keep track of users at a very low level: just track them across requests, and maybe track if they come to the site again the next day, but that's not super important. I don't want to make them log in or anything.

And it has a splash screen. I want the following behavior:

- first time you come to /, you get "welcome to this app, click to start"

- if you click that, you go to /play and I create a User entry for you.

- if you come to / again, you get "you've already started. click here to continue OR if that wasn't you, click here to restart."

- if you restart, I create a new User entry for you.

Very simplified App Engine python server code:

application = webapp.WSGIApplication([('/', Intro),
                                      ('/restart', Restart),
                                      ('/play', Play), ...

Set up the URL mappings.

class Intro(webapp.RequestHandler):
    def get(self):
        cookie_id = self.request.cookies.get('sea_salt_id')
        if cookie_id and User.get_by_id(int(cookie_id)):
            #(render already_started.html)
        else:
            #(render index.html)

Pretty simple. If you go to /, first check the cookie. If your cookie corresponds to a real user, then you must have been here before. already_started.html contains links to /restart and /play. Otherwise, you haven't been here, so show you the splash screen, which has just a form that posts to /play.

class Restart(webapp.RequestHandler):
    def get(self):
        self.response.headers.add_header(
            'Set-Cookie',
            'sea_salt_id=-1; expires=Thu, 01-Jan-1970 00:00:01 GMT')
        self.redirect('/')

If you go to /restart, I delete your cookie (by setting it to expire in the past) and send you back to /.
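
As an aside, here's a sketch of building that kind of Set-Cookie value with the http.cookies module from the Python 3 standard library instead of by hand. This is not what the App Engine code above does, and the two helper names are made up for illustration.

```python
from http.cookies import SimpleCookie


def set_cookie_value(name, value, expires):
    """Return the text that goes after "Set-Cookie:" for one cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["expires"] = expires
    return cookie[name].OutputString()


def delete_cookie_value(name):
    """Expire a cookie by giving it a date in the past, like /restart does."""
    return set_cookie_value(name, "-1", "Thu, 01-Jan-1970 00:00:01 GMT")
```

The point is only that "deleting" a cookie really means overwriting it with one that has already expired.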

class Play(webapp.RequestHandler):
    def get(self):
        cookie_id = self.request.cookies.get('sea_salt_id')
        if not cookie_id or not User.get_by_id(int(cookie_id)):
            self.redirect('/')
            return  # redirect() doesn't stop the handler by itself
        #(render game.html)

    def post(self):
        cookie_id = self.request.cookies.get('sea_salt_id')
        if not cookie_id or not User.get_by_id(int(cookie_id)):
            user = User.create()
            user.put()
            id = user.key().id()
            self.response.headers.add_header(
                'Set-Cookie',
                'sea_salt_id=%d; expires=Fri, 31-Dec-2020 23:59:59 GMT' % id)
        #(render game.html)

This is the trickiest. If you go to /play via a GET (like typing it in the address bar), either let you keep playing (if you've already started) or redirect you to /. If you go to /play via a POST, either let you keep playing (if you've already started) or create a user for you and then let you play. I think this is right, because GETs should be read-only while POSTs can write, right?
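
To convince myself the rules hang together, the whole thing can be boiled down to one plain function with no App Engine in it. This is just a sketch: decide() and its return values are hypothetical, and known_ids is a stand-in for the User.get_by_id() datastore lookup.

```python
def decide(path, method, cookie_id, known_ids):
    """Return (what_to_show, create_user) for one request.

    known_ids is an assumed stand-in for the User datastore lookup.
    """
    returning = cookie_id is not None and cookie_id in known_ids
    if path == "/":
        page = "already_started.html" if returning else "index.html"
        return (page, False)
    if path == "/restart":
        return ("redirect:/", False)  # the cookie gets deleted on the way
    if path == "/play":
        if returning:
            return ("game.html", False)
        if method == "POST":
            return ("game.html", True)  # new User entry, new cookie
        return ("redirect:/", False)  # a GET never writes anything
    raise ValueError("unknown path: %r" % path)
```

Written this way, the only branch that creates a user is a POST to /play from someone without a valid cookie, which matches the "GETs are read-only" idea.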

This all seems a little too complex for its own good, but it seems to work. If you have any better ideas (or if I've made any mistakes), I'd love to hear them. Thanks!

Wednesday, June 1, 2011

.vimrc: colorcolumn

80 character line? No problem!

" displays a red column at 80 characters
set colorcolumn=80

Thanks to this stack overflow post.

EDIT: you should probably surround it with an "if exists" to avoid annoyance if you port your .vimrc to another machine that has an older version of vim (colorcolumn is new in 7.3). Here's the syntax:

if exists('+colorcolumn')
  set colorcolumn=80
endif

Thanks to this answer on that same stack overflow post.

Wednesday, May 25, 2011

tComment

Another nice thing in IDEs is auto-commenting and uncommenting text. It's surprisingly complicated when you consider that most languages have both line-style (//) and block-style (/* */) comments, and the particular characters used differ from language to language. So there's no one-liner to robustly comment things in Vim.

I just got tComment, though, and I like it. I've learned a couple things:
- installing vim plugins is often as easy as downloading a .vba file, opening it in vim, and typing ":so %".
- you can't remap the slash key. I'd like it to be the same as Eclipse, so ctrl-/ (or command-/ on mac) would toggle comments. But I don't think that can be done. (do correct me if I'm wrong.)

.vimrc: switching tabs in MacVim

I'm not yet all-command-line all-the-time. (ACLATT?) Sometimes I use, say, browsers. So I've been liking MacVim.

I also like how alt-tab (or command-tab) switches between windows, while control-tab switches between tabs in my browser window. When I found out that MacVim windows can have multiple tabs too, I wanted control-tab to work there too. Hence, the newest (MacVim-specific) addition to my .vimrc:

" In MacVim, you can have multiple tabs open. This mapping makes
" ctrl-tab switch between them, like browser tabs.
" I don't think it matters whether I use noremap or map, unless
" :tabnext gets bound to something else, which would be weird.
noremap <c-tab> :tabnext<cr>

Friday, April 29, 2011

Android: files on internal storage.

I'd only read/written to SD card files before. When you're writing to the SD card, you can use standard Java file I/O. But if you want to write to internal storage, you can't, because you're supposed to ignore where the files go.

Instead, you read and write internal files with Context.openFileInput() or Context.openFileOutput(). When you do, the files will be in /data/data/your_project_package_structure/files/your_file. (yeah, that's two /data/s.) For example:
/data/data/com.dantasse/files/hello.txt

More details: official docs and someone's tutorial (which is a lot more useful).

Tuesday, April 19, 2011

Building a .vimrc: tabs

Just nuked my hard drive and am explicitly pulling data back from my Time Machine as I need it. I could grab my old .vimrc, but I think I'll build a new one instead. (the old one wasn't very big.)

First things first: tabs.

" whenever I hit the tab key, it now inserts spaces instead.
set expandtab

" whenever I hit << or >>, it now indents this many spaces instead.
set shiftwidth=2

" tabstop is how many spaces a tab character looks like.
set tabstop=2

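Note that even with expandtab set, tabstop still matters: it also decides how many spaces the tab key inserts (unless you set softtabstop as well). So it's worth keeping even if you never open a file that already contains tab characters.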

More info: http://tedlogan.com/techblog3.html

Friday, April 8, 2011

Null considered harmful?

So I subscribed to Gary Bernhardt's Destroy All Software screencasts. I've only watched the demo one ("How and why to avoid nil"), but I think it's pretty good! (disclaimer: I know him IRL; un-disclaimer: he's very good at software; in conclusion, you ought to subscribe too)

The message is pretty simple: avoid nil/null. And I dig the justification: someday you'll get a traceback that says NullPointerException (forgive me for translating to Java; I used to work for a big company), which tells you where someone tried to dereference null, but it doesn't tell you where the null came from.

But that frustrates me! In a lot of cases, null makes sense. For example, if you're getting something from a map. If "foo" is not a key in your map, map.get("foo") should return null! Otherwise, if map.get("foo") throws an exception, you have to do all this gross exception handling. And not only is it gross, but you're using exceptions for control flow, which is a Big No-No.

Or, okay okay, I'm learning python, you always say:
if "foo" in map:
    map["foo"]
Seems uglier than it needs to be. You're asking for "the value that corresponds with foo" and sometimes there is none.
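
To be fair to Python, dicts also have a get method that covers the "sometimes there is none" case in one call. A small sketch (the ages dict and the variable names are made up for illustration):

```python
ages = {"alice": 30}

# The two-step version: check first, then look up.
if "bob" in ages:
    bob_age = ages["bob"]
else:
    bob_age = None

# dict.get does the same thing in one call: None when the key is absent...
bob_age_again = ages.get("bob")

# ...or a default of your choosing, which sidesteps None entirely.
bob_age_or_zero = ages.get("bob", 0)
```

The last form is the one that fits the "avoid nil" advice best: the caller always gets back a usable value and never has to check for None.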

Nevertheless, I've heard the null pointer referred to as a terrible idea; for example, here. So I'd like to try writing a whole project without nulls and see what happens.

Monday, January 3, 2011

Packages and Android internal data

1. Packages: there are two concepts of "package" on Android. Here's a great overview. Make a new Android application package for each project, and don't change it if you can avoid it. In my case, I had originally started on my current project, "How are you right now?", with the package com.dantasse, but reusing that one package for everything leads to two problems:
- I couldn't install multiple apps with the Android application package com.dantasse on the same phone at the same time
- I couldn't publish multiple apps with the Android application package com.dantasse in the Android market at the same time.

2. Android internal storage: it's pretty opaque. The official Android docs are pretty good at telling you how to access both internal storage (on the phone itself) and external storage (SD card). Internal storage is easy to use, but AFAICT it's not viewable via the standard filesystem. (I read somewhere that it goes in /data/data/(your app) , but I tried using Astro file manager to browse through it on the phone and couldn't find it. And I wasn't able to mount the internal data on a computer, just the SD card.)
There might be something you can do with content providers or something to get this internal data out of your app, but I don't know it.