Wednesday, September 10, 2014

Tilde key on MacBook Air with Ubuntu




How to get the backtick (`) and tilde (~) symbols in Ubuntu installed on a MacBook Air with an EU keyboard (instead of the backslash (\) and pipe (|) symbols that the key produces by default):

1. Run xev and press the tilde key. Find the keycode associated with this key in the output.

2. Create or edit the file ~/.xmodmaprc and add the following line to it:

keycode <keycode from xev output> = grave asciitilde 

3. Run: xmodmap ~/.xmodmaprc
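
Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration: the helper name xmodmap_line and the sample text are hypothetical, and keycode 94 is a made-up value, so use whatever xev reports on your machine.

```python
import re

# Illustrative fragment of a KeyPress event as printed by xev.
# The keycode value (94) is an assumption for this example only.
SAMPLE_XEV_OUTPUT = """KeyPress event, serial 37, synthetic NO, window 0x3e00001,
    root 0x25d, subw 0x0, time 1234567, (85,90), root:(1010,540),
    state 0x0, keycode 94 (keysym 0x5c, backslash), same_screen YES,
"""


def xmodmap_line(xev_output):
    """Return the ~/.xmodmaprc line for the first keycode found in xev output."""
    match = re.search(r"keycode (\d+)", xev_output)
    if match is None:
        raise ValueError("no keycode found in xev output")
    return "keycode {} = grave asciitilde".format(match.group(1))


print(xmodmap_line(SAMPLE_XEV_OUTPUT))  # prints: keycode 94 = grave asciitilde
```

The printed line is what goes into ~/.xmodmaprc in step 2.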

The method is taken from here: http://stackoverflow.com/questions/17757232/switch-tab-and-backtick-keys-ubuntu-linux

Wednesday, January 22, 2014

How to read an XML file into a pandas DataFrame using lxml

This is probably not the most efficient way, but it's convenient and simple.

Let's pretend that we're analyzing a file with the content listed below:

<xml_root>

    <object>
        <id>1</id>
        <name>First</name>
    </object>

    <object>
        <id>2</id>
        <name>Second</name>
    </object>

    <object>
        <id>3</id>
        <name>Third</name>
    </object>

    <object>
        <id>4</id>
        <name>Fourth</name>
    </object>

</xml_root>

First, we need to import objectify from lxml:

from lxml import objectify

Then, parse the file:

path = 'file_path'
# lxml accepts a file path directly, so no file handle is left open
xml = objectify.parse(path)

Get the root node:

root = xml.getroot()

Now we can access the child nodes. With
root.getchildren()[0].getchildren()
we get the content of the first child node as a simple Python list:

[1, 'First']
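
To see what that list holds, here is a small sketch that parses an inline copy of the first two objects instead of a file. The name SAMPLE is made up for this example. Each item in the list is an lxml element, so both its tag and its text are available.

```python
from lxml import objectify

# Inline copy of the first two <object> entries from the sample file.
SAMPLE = """<xml_root>
    <object><id>1</id><name>First</name></object>
    <object><id>2</id><name>Second</name></object>
</xml_root>"""

root = objectify.fromstring(SAMPLE)

for obj in root.getchildren():
    # .tag is the element name, .text is its raw string content
    fields = [(field.tag, field.text) for field in obj.getchildren()]
    print(fields)
```

Note that .text always returns a string, which is why the id values end up as strings rather than integers.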

Now we obviously want to convert this data into a data frame.

Let's import pandas:

import pandas as pd

Prepare an empty data frame that will hold our data:

df = pd.DataFrame(columns=('id', 'name'))

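Now we go through our XML file, adding each object to this data frame as a new row:

for i, child in enumerate(root.getchildren()):
    obj = child.getchildren()
    row = dict(zip(['id', 'name'], [obj[0].text, obj[1].text]))
    row_s = pd.Series(row)
    row_s.name = i
    # DataFrame.append was removed in pandas 2.0; concatenate a one-row frame instead
    df = pd.concat([df, row_s.to_frame().T])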

(The name of the Series object becomes the row's index label when the row is added to the DataFrame.)

And here is our fresh data frame:

  id    name
0  1   First
1  2  Second
2  3   Third
3  4  Fourth
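
The same table can also be built without growing the data frame row by row. The sketch below is one possible variation, not the method used above: it collects one dict per object and creates the data frame in a single call, taking the column names from the element tags. The names SAMPLE and xml_to_dataframe are made up for this example.

```python
from lxml import objectify
import pandas as pd

# Inline copy of the sample file, so the example runs without a file on disk.
SAMPLE = """<xml_root>
    <object><id>1</id><name>First</name></object>
    <object><id>2</id><name>Second</name></object>
    <object><id>3</id><name>Third</name></object>
    <object><id>4</id><name>Fourth</name></object>
</xml_root>"""


def xml_to_dataframe(root):
    """Build a DataFrame with one row per child of root, one column per field tag."""
    rows = [
        {field.tag: field.text for field in obj.iterchildren()}
        for obj in root.iterchildren()
    ]
    return pd.DataFrame(rows)


root = objectify.fromstring(SAMPLE)
df = xml_to_dataframe(root)
print(df)
```

The values are still strings here, as in the loop above. Something like df['id'].astype(int) converts a column if numbers are needed.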


Full source code:

from lxml import objectify
import pandas as pd

path = 'file_path'
xml = objectify.parse(path)  # lxml accepts a file path directly
root = xml.getroot()
root.getchildren()[0].getchildren()
df = pd.DataFrame(columns=('id', 'name'))

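for i, child in enumerate(root.getchildren()):
    obj = child.getchildren()
    row = dict(zip(['id', 'name'], [obj[0].text, obj[1].text]))
    row_s = pd.Series(row)
    row_s.name = i
    # DataFrame.append was removed in pandas 2.0; concatenate a one-row frame instead
    df = pd.concat([df, row_s.to_frame().T])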
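
For comparison, pandas 1.3 and later ship a pandas.read_xml function that uses lxml as its default parser. This is not the method described in this post, just a short sketch of the built-in route, assuming a recent pandas with lxml installed. SAMPLE is a made-up name holding an inline copy of the file.

```python
from io import StringIO

import pandas as pd

# Inline copy of the sample file, wrapped in StringIO so it behaves like a file.
SAMPLE = """<xml_root>
    <object><id>1</id><name>First</name></object>
    <object><id>2</id><name>Second</name></object>
    <object><id>3</id><name>Third</name></object>
    <object><id>4</id><name>Fourth</name></object>
</xml_root>"""

# The xpath selects the elements that become rows; their children become columns.
df = pd.read_xml(StringIO(SAMPLE), xpath=".//object")
print(df)
```

Unlike the manual loop, read_xml infers column types, so the id column comes back as integers.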