Santa Clara, California

IMG_20120114_110053

So it has come to my attention that I haven’t written a post about California, even though I have been here for 3 weeks now.

In short, it’s absolutely AWESOME!

All the stories I heard about California weather turned out to be true. Clouds are a rare sight, and there has only been 1 rainy period (3 days of intermittent, very light rain) in the past 3 weeks. The rest of the time it has been sunny and cloudless. This is the coldest time of year, and it’s about 15C outside. I can still walk down the street in a t-shirt on sunny days. I wonder what summer will be like, but luckily my place has AC, so I’m not TOO worried.

The trip here and settling in were a little hectic. It was my first time living by myself far away from home, in a new place, and it didn’t really help that I had 8 hours of jet lag (I had flown in from Taiwan the day before) and had to work the next day.

The first day was just running a bunch of errands, on foot, because I hadn’t realized how bad public transit in Santa Clara is. For the record, it is really bad. I’m not sure if it’s because people here are too rich to take the bus or what, but the public transit system is nowhere near as convenient as in Vancouver. Bus stops are few and far between, and buses come once or twice an hour for the most part. I ended up logging about 12km on foot that day. It wasn’t really that bad, since I am used to long distance walking/running, but still, a better mode of transportation would be desirable. I did get a good look at the city, though. The residential areas are definitely rich. All the houses look really nice, like what you would see in movies about upscale American communities. The people are generally nice, but I did encounter a few shady people, though I guess that’s the same in Vancouver.

Then work started at NVIDIA. It has been relatively uneventful. We are doing some post-silicon work on Kepler GPUs (which will be absolutely AWESOME!!). I got to play with a few really cool toys, like a 12.5GHz oscilloscope that costs more than my house, with a 10GHz active differential probe that costs more than my car. I also witnessed a lab technician soldering a 1700-ball BGA chip in almost exactly 37 minutes (and then de-soldering it… for unspecified reasons). I wish my job could be more creative, but the money is good, and I guess they generally don’t let interns do GPU architecture.

A typical work day is from 10am to 7pm, because NVIDIA offers subsidized dinner starting at 7, so everyone stays till 7. Total conspiracy.

Luckily my place is within walking distance of work, so I don’t need to worry about transportation on weekdays. Weekends, though, are a little more problematic, especially since I have to go to the airport across the city every week to do flight training. What we decided on is to rent a car on weekends to get around. It’s about $60-70 per weekend, depending on season, which is not too bad, since I’m splitting the cost with 2 other people. In comparison, renting a compact car for a month, plus insurance, is about $600. Technically, all car rental companies charge a young renter fee of about $15/day for people between 20 and 25. In reality, at least for Hertz, that’s regularly waived by a promotion that’s always on.

A few things I realized -

  1. Sliced bread is the best invention after Google Maps + GPS. I got a 3G phone plan before food on my first day.
  2. Yelp is really useful when you are in a new city, and need to find some place to eat.

After the first week, things became less chaotic, and I started my flight training!

There are so many places to go around here – San Francisco is about 45 minutes away, through Palo Alto, Mountain View, and Sunnyvale, basically across Silicon Valley. Then there are Los Angeles and Las Vegas, which are about 5 hours away (or 2 hours and a plane, if you have a pilot license). So far we have only been to San Jose and Palo Alto, and only to Walmart and IKEA, respectively. Sacramento, the capital of California, is also only 2 hours away. A lot of exploring to do in the next 7 months!

Overall, the city feels very positive. I’m sure the weather plays a big part, but so do the people. It’s in the middle of Silicon Valley, so the people here are pretty much all rich, happy, and well-educated engineers. It’s a place that makes you feel like engineering is the best profession in the world. If you are an engineer with good skills, you can get a good life here. There are opportunities everywhere, and the only thing stopping you is what you are capable of. Of course, that also means that if you don’t have good skills, you won’t find a job, because good engineers are everywhere.

For future reference, when people here (in Santa Clara/San Jose) say “the city”, they are referring to San Francisco. For a lot of people, Silicon Valley is where they work on weekdays, and SF is where they go to have fun on weekends (if you are into unhealthy drinking, crazy clubbing, etc…). Very interesting place!

Posted on February 2nd, 2012 by matthew

First Flying Lesson – “At least one of us knows where the airport is”

Ever since my introductory flight at the Pacific Flying Club at the Boundary Bay Airport, I’ve been looking for an opportunity to continue on and get my private pilot license. For those who aren’t familiar, a PPL allows you to fly pretty much any single engine propeller plane for non-commercial purposes, in good weather. It takes about 50-60 hours of flight time, and a variable number of hours of ground instruction. It usually takes about 5 months if you fly 3 times a week.

I didn’t really have time when I was in Vancouver, but I got a co-op work term in Santa Clara, California, so I’ll be here for 8 months, and it seems like a perfect opportunity, because

  • Flying in California is much cheaper, mostly due to lower fuel cost. About $7000 for everything, compared to >$10000 in Canada.
  • I can only fly in good weather (high cloud ceiling, light wind), and California weather is certainly a lot nicer than Vancouver’s.
  • I live 15 minutes away from the airport.

An American license can be easily converted into a Canadian license by doing a short written test.

In the San Jose area, the most prominent general aviation airport is KRHV (Reid-Hillview Airport), and there are about 5 flight schools there.

I did a lot of research, and decided on Aerodynamic Aviation (http://aerodynamicaviation.com). It’s a flight school with a long history and a very good reputation.

Today I took the first lesson! It was a lot of fun. I spent about 2.5 hours there, 0.8 hours of which were flying (the rest was paperwork, inspections, talking, etc).

The plane was a Champion Citabria. http://en.wikipedia.org/wiki/American_Champion_Citabria

(Wikipedia’s picture. I would have taken pictures if my phone didn’t die on me)

It’s pretty old, and definitely not high performance, but it was more fun to fly (IMHO) than the more common Cessna 152/172, since it was designed for aerobatics. It has a very low stall speed, and no flaps (which was a little unusual). It’s also a tailwheel plane, instead of the more common and easier tricycle gear. The cockpit view is a lot better than in a Cessna. I could see EVERYTHING.

I was planning to use a Cessna 172, but I’ve decided to use the Champion instead, because the instructor says it usually only takes about 3 additional hours (tailwheel landings are more difficult), and Cessna 172s are harder to book because they are used for instrument training. Plus, who can say no to an $85/hr airplane…

The CFI (flight instructor) was a nice guy. He gave very thorough explanations of everything from instruments, to aerodynamics, to airplane components. We took around 20 minutes on pre-flight inspections, because he tried to point out every little detail to me, plus his own experiences (which things commonly break, and how they break), which is really cool. During the 50-minute flight, we did turns, climbing turns, descending turns, and a lot of miscellaneous stuff (throttle control, mixture, etc). Instruction was awesome, and I got plenty of chances to practice until I felt pretty comfortable with those maneuvers.

We climbed to 5000′, went to the practice area, and just did a lot of turns…

https://maps.google.com/maps/ms?msa=0&msid=213678659607032853366.0004b713d3de3ad39857d

(Why does it look like a one way trip? Because my phone ran out of battery)

I looked him up afterwards, and apparently he is an ATP/CFII/MEI with 7000 hours of experience! I’m not sure why they gave me such a “high end” instructor when I’m only doing PPL, but I’m not complaining. His name is Rich Digrazzi. Highly recommended.

Looking forward to my next lesson next week. Will definitely need to schedule more lessons if I want to finish this in 7 months, though.

Posted on January 22nd, 2012 by matthew

How to Get Skype to Work on 64-bit Linux

I have been using Linux for about 7-8 years now, starting with Red Hat 9, before it went commercial. The free fork, Fedora, is at version 16 now (I used it till version 3, I think). Makes me feel very old.

It’s really amazing to see how much Linux has grown since then. Back then, Linux had just started targeting desktops. Before that, Linux was widely used in servers and embedded systems, but that was about it.

The thing I notice has improved the most is audio support. I still remember that in RH9, most consumer audio chips weren’t supported, and I actually had to go hunt down a PCI sound card with a Linux driver. Even then, there were configuration files to write and driver settings to set, and only 1 program could output sound at a time (this was when ALSA had just been adopted, and without dmix, I believe), unless you had an expensive sound card that supported hardware mixing.

Nowadays, audio works perfectly out of the box with all machines I’ve installed Linux on. For the most part, no configuration is needed at all.

Until of course, Skype comes along. It’s a proprietary closed-source thing with some brainfuck design (that will become apparent later). Unfortunately, I have to use it to talk to my grandparents, since they don’t know how to use computers, and have this standalone Skype handset thing…. long story. So I had to get it to work.

So I went to the Skype website, and conveniently, they have a deb package, which miraculously installs and runs without segfaulting.

Everything worked fine, except it couldn’t get any audio input at all. So I went into Skype settings, and saw that all input/output devices were set to “pulse server (local)”, which is good. That suggested there was something wrong with pulse, so I opened the built-in media recorder thing, which uses PulseAudio, and surprisingly it recorded with no problem at all.

So it’s a problem between Skype and PulseAudio. I couldn’t get it to work.

I first tried removing PulseAudio, but that doesn’t really work, because GNOME 3 is so deeply integrated with PulseAudio, that I won’t be able to get volume control and hotkey support for ALSA.

That’s fine, I’ll just get Skype to use ALSA directly, and have everything else still go through PulseAudio. That’s not as easy as I thought, because Skype, for some bizarre reason, will always use PulseAudio if it finds PulseAudio running. There is no option to override that.

The solution I found, is to remove 32-bit version of libpulse, so that 32-bit programs (like Skype) won’t be able to use PulseAudio. On Debian, the 32-bit library is in the same package as the 64-bit library, so I had to do some dirty work.

sudo rm /usr/lib32/libpulse*

Bye, 32-bit libpulse!

Then restart Skype, and configure it to use the correct ALSA devices.

Everything works now, except you can’t initiate or take a call when there’s another program using Pulse, since Pulse hogs ALSA exclusively.

The solution is to configure Pulse to go through dmix instead. dmix is ALSA’s primitive software mixer, which lets cheap sound cards without hardware mixing play sound from multiple programs at once.

It’s a design decision made by Debian and Ubuntu to have PulseAudio go to the hardware directly instead of through dmix, because PulseAudio has its own mixer, and cascading mixers is bad for latency. If you do serious audio work, you’ll probably not want to do this. I don’t really care. I didn’t notice any difference with youtube videos.

In /etc/pulse/default.pa

load-module module-alsa-sink device=dmix

And comment out the automatic hardware detection line

#load-module module-udev-detect

That’s all! Everything should work perfectly at the same time.

Posted on December 31st, 2011 by matthew

DIY Network-Attached Storage on Linux

It’s nice to have a NAS at home for file sharing and automated backup, but commercial NASes are expensive for their performance, and not as flexible as a real PC.

This is a guide (mostly as a memo for myself) to build a Linux-based NAS from scratch, both hardware and software, and assumes the reader has some understanding of PCs (knows how to assemble them, or select a pre-built one with specifications), and at least some Linux knowledge.

The end result will be a NAS that is accessible over CIFS (“Network Neighbourhood”) and SFTP (useful for accessing over the internet), provides snapshot incremental backups (time travel!!), and runs on some kind of RAID.

There are nice commercial units that can do all those things, such as the Synology DS-211j, but they are slow for their price, and you can’t re-purpose them after end of life, or make them do other non-NAS things (running a game server, etc).

However, if like me, you have some time on your hands, and skills, you can build a very high performance NAS for cheap. Even better if you have an old PC lying around – it could be free!

Hardware

If you have a PC lying around, with SATA ports, you are all set. IDE is possible, too, but then assuming you don’t have big enough IDE drives, you’ll have to buy them. And I don’t recommend buying IDE drives, because they are obsolete already, and no new computer can use those drives. If you do already have huge IDE drives, then by all means.

Otherwise, if you are like me, without a usable old computer (the P3 I’ve been using as a NAS for the past 5 years just decided to commit suicide, though of course, I was able to get all the data out), it’s shopping time!

Harddrives

First of all, you’ll need to decide how many harddrives you want, which is determined by how much space you need, and what level of redundancy you want (http://en.wikipedia.org/wiki/RAID). All standard RAID levels are supported by Linux’s software RAID implementation, which is what we will be using. I don’t recommend going with a single-drive system, because if the harddrive fails, and I’ve personally had 3 fail on me, it doesn’t matter how much backup you have on that harddrive…

Speed also doesn’t matter too much, because network speed will most likely be the bottleneck. Definitely don’t use SSDs in a NAS. That’s an epic waste of money for nothing.

In this example, I will be using 2×1TB WD Caviar Green drives I had in my old server, in RAID-1 configuration.
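To get a feel for the space versus redundancy trade-off before buying drives, here is a rough Python sketch of usable capacity for the common RAID levels. It assumes all drives are the same size and ignores filesystem overhead; the function name `usable_tb` is made up for this illustration and is not part of mdadm.

```python
def usable_tb(level, drives, size_tb):
    """Approximate usable capacity in TB for identical drives (hypothetical helper)."""
    if level == 0:                      # striping only, no redundancy
        return drives * size_tb
    if level == 1:                      # mirroring: every drive holds the same data
        return size_tb
    if level == 5 and drives >= 3:      # one drive's worth of parity
        return (drives - 1) * size_tb
    if level == 6 and drives >= 4:      # two drives' worth of parity
        return (drives - 2) * size_tb
    if level == 10 and drives >= 4 and drives % 2 == 0:  # striped mirrors
        return drives * size_tb / 2
    raise ValueError("unsupported level/drive count for this sketch")

# The setup in this guide: two 1TB drives in RAID-1
print(usable_tb(1, 2, 1))
```

So mirroring two 1TB drives gives 1TB of usable space, which is the price paid for surviving a single drive failure.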

Motherboard + CPU + RAM + PSU

I recommend an Intel Atom (as of this writing), because CPU speed is not important unless you are doing RAID-5 (in which case it’s marginally important). Make sure the motherboard has enough SATA ports for your setup. I am using Intel D525MW. Intel boards are known for their reliability, and I really like the fanless design, because fans do fail once in a while, and heatsinks never fail. Fans also increase dust build-up, increasing need for maintenance.

Memory is also not very important. 256MB will probably do, but you can’t buy anything that small nowadays. I’ll be using 2×2GB DDR3 sticks from my old laptop (D525MW uses laptop RAM). More free memory for caching will increase performance somewhat, because Linux will cache your most used files in RAM, and serve them lightning fast when you need them again, but again, not so much in a NAS, because network is relatively slow, though it should help if you need to access many small files.

Power supply is important. Pick one with good reviews. Anything higher than 200W will do, unless you are planning to have a 20-drive array or something, in which case you wouldn’t be reading this article because you’ll know what you are doing more than I do already. PSU is the one thing you shouldn’t cheap out on!!! A good PSU fails by shutting itself down. A bad PSU fails by sending 200V into your components, and optionally setting your house on fire. You want the former. Anything from Antec, Corsair, or Seasonic should be fine, but check reviews online. I will be using a Corsair 430W (CMPSU-430CXV2).

If the motherboard you picked doesn’t have gigabit ethernet, you’ll definitely want to pick up a gigabit card. They are very cheap nowadays and will make your NAS 5-6x faster. Intel cards are known to work very well in Linux, and are very fast.

Total cost I got (NCIX with aggressive price matching):

  1. D525MW – $79
  2. Corsair 430W PSU – $40
  3. 4GB USB stick to install the OS on (to keep things simpler, I like to have my data drives dedicated to data, and have the OS somewhere else) – $5

Total is about $125 without harddrives, but this depends on what hardware you have lying around that you can reuse.

Network

If your home network is already gigabit, you are all set. Otherwise, you’ll have to decide if you want to upgrade or not. On a 10/100 Mbps network, typical large file transfer performance is around 10MB/s. On a gigabit network, 50+MB/s is typical, and 70-80MB/s is definitely attainable with some optimization. It’s up to you, but I would definitely try getting a 1 Gbps network running.
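To put those numbers in perspective, here is a small Python sketch that converts link speed to the raw wire limit in MB/s and estimates how long a big file takes at the typical rates quoted above. The 4.7 GB file size (roughly a DVD image) is just an assumption for illustration, and the helper names are made up.

```python
def wire_limit_mb_per_s(link_mbps):
    """Raw line rate in megabytes per second (8 bits per byte, no protocol overhead)."""
    return link_mbps / 8

def transfer_minutes(size_gb, mb_per_s):
    """Minutes to move size_gb gigabytes (1 GB = 1000 MB here) at a sustained rate."""
    return size_gb * 1000 / mb_per_s / 60

# (label, link speed in Mbps, typical real-world MB/s from the text above)
for label, link, typical in [("100 Mbps", 100, 10), ("1 Gbps", 1000, 50)]:
    print(f"{label}: wire limit {wire_limit_mb_per_s(link):.1f} MB/s, "
          f"4.7 GB takes about {transfer_minutes(4.7, typical):.1f} min "
          f"at a typical {typical} MB/s")
```

The wire limit is only an upper bound; real transfers land below it because of protocol overhead and disk speed.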

To run a 1 Gbps network, you need

  • A 1 Gbps card in every computer that you want 1 Gbps on. It’s possible to have a mixed environment, and communication between 1 Gbps computers will still be 1 Gbps.
  • Cat 6/5e cables. They look and work exactly like regular Cat 5 network cables, except they are built to higher standards to guarantee 1/10 Gbps operation. If you are lucky, Cat 5 will work at 1 Gbps, too, but that’s not recommended and may not be reliable, though sometimes you don’t have a choice (if the cable is in walls).
  • A gigabit router or switch. For some reason, gigabit wireless routers are still very expensive, even though gigabit switches are dirt cheap. If you already have a 10/100 router, you can just add a gigabit switch behind it, and connect all your computers to the switch. This way, you’ll have a 10/100 internet connection (which is fine, unless you have 100+ Mbps internet), and 1 Gbps within your LAN.

My victim:

IMG_20111215_204123

Setup overview

I initially planned to run FreeNAS (an open source NAS system based on FreeBSD), but after evaluating it in a virtual machine, I don’t think I really want to trust it with my data, yet. It just underwent a big rewrite after getting taken over by a company, is in a huge mess, and a bug as big as “email subsystem doesn’t work at all” slipped past their QA and into the stable 8.0 release. I also encountered a few bugs in just 10 minutes of testing.

I decided to go back to good old Linux instead. For our purpose, practically any distribution will work. I picked Ubuntu Server 10.04 LTS because I’ve had some experience with it before, and LTS status means it will be supported till 2015 (updates, etc). This guide should be fairly independent of distro.

It will be installed on a bootable 4GB USB drive that’s permanently plugged in, and the two 1TB data drives will be in RAID-1, providing a total of 1TB of space. The space will be divided into 2 CIFS/Samba/“Network Neighbourhood” shares, 1 for my HD porn collection stuff, and one for my parents’ office documents.

All the data will be stored on an ext4 filesystem, on the RAID volume. Snapshot backup will be set up to provide rotating backup every few hours using the rsnapshot utility (based on rsync and hardlinks).

There’s an advanced filesystem called ZFS that has built-in RAID and snapshots, but Linux support for ZFS is only through a barely maintained FUSE driver, so probably not a good idea for a production system. Btrfs is Linux’s answer to ZFS that has most of the same functions, but it’s still experimental, so also no. ReiserFS 4 was pretty promising, too, until the main developer got thrown into jail for murdering his wife… Ext4 is pretty fast, well tested, and well supported (filesystem utilities) in case something bad happens. If you will mostly store huge files (disk images, movies), XFS may give you higher performance.

Software Setup

After assembling your victim, install your Linux distro of choice (just do a basic install; we will do all the drive preparation and partitioning later). At this point you’ll need a monitor and keyboard attached to the NAS.

Network setup

First we have to set up networking. If you didn’t set it up during installation, it probably defaults to DHCP. DHCP is bad because it means your server’s IP can change over time. You can set a static IP in /etc/network/interfaces (on Debian/Ubuntu at least). Remember to set your router to exclude that IP from the DHCP pool, otherwise it may hand out this IP to another computer, and bad things will happen.

Here is my /etc/network/interfaces (note the weird address for router. I have a weird router. Yours is probably at .1)

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
#iface eth0 inet static
iface eth0 inet static
address 192.168.1.50
netmask 255.255.255.0
gateway 192.168.1.254

Then do

ifdown eth0
ifup eth0

to reset the network interface (if you are doing this over ssh, make sure you type them on one line… “ifdown eth0 && ifup eth0”, for obvious reasons).
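A typo in the address, netmask, or gateway is an easy way to make the NAS unreachable, so a quick sanity check can be worthwhile. The sketch below uses Python’s standard ipaddress module to confirm that the gateway sits inside the subnet defined by the static address and netmask; the values are the ones from my file above, and the function name is made up for illustration.

```python
import ipaddress

def gateway_in_subnet(address, netmask, gateway):
    """True if the gateway lies inside the subnet defined by address/netmask."""
    network = ipaddress.ip_interface(f"{address}/{netmask}").network
    return ipaddress.ip_address(gateway) in network

# Values from the /etc/network/interfaces example above
print(gateway_in_subnet("192.168.1.50", "255.255.255.0", "192.168.1.254"))
```

If this prints False for your own values, the interface will come up but nothing beyond the local subnet will be reachable.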

Make sure the interface is running at 1000 Mbps, full duplex

matthew@nas:~$ dmesg|grep eth0
[ 1.713278] e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
[ 15.115988] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 15.145697] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 15.146953] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 25.840015] eth0: no IPv6 routers present
[ 229.880116] e1000: eth0 NIC Link is Down
[ 323.301869] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

Everything from this point on can be done over the network with SSH!

Disk preparation

Since the disks are brand new (or only have data you don’t care about), now would be a good time to do a destructive read-write test to make sure the media is good. This is especially important if you are using old harddrives. Not required, just recommended.

In my case, my data drives are /dev/sda and /dev/sdb

badblocks -w -v -s /dev/sda
badblocks -w -v -s /dev/sdb

They can be run at the same time. Will take a few hours (7 hr for my 1TB) depending on your drives. Make sure no errors are reported.

Then, we can partition the data drives. They will each get a huge partition with RAID type. It’s a special partition type that tells mdadm (Linux’s RAID manager) the partition is part of an array.

Any Linux partitioning program will do: fdisk, cfdisk, parted, etc. Note that if you have an “Advanced Format” (4K sectors) disk like I do, you’ll want to make sure the program you use properly aligns your partition to a multiple of 8 sectors, since AF drives lie to the OS that they have 512-byte sectors for backward compatibility (http://wdc.custhelp.com/app/answers/detail/a_id/5655/~/how-to-install-a-wd-advanced-format-drive-on-a-non-windows-operating-system).

matthew@nas:~$ sudo fdisk /dev/sda
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0xcacafadd.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
switch off the mode (command 'c') and change display units to
sectors (command 'u').

Command (m for help): u
Changing display/entry units to sectors

Command (m for help): p

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xcacafadd

Device Boot Start End Blocks Id System

Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First sector (63-1953525167, default 63): 64
Last sector, +sectors or +size{K,M,G} (64-1953525167, default 1953525167): +1953525096

Command (m for help): p

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xcacafadd

Device Boot Start End Blocks Id System
/dev/sda1 64 1953525160 976762548+ 83 Linux

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): FD
Changed system type of partition 1 to fd (Linux raid autodetect)

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Same for the other disk.
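The reason for starting at sector 64 instead of fdisk’s default of 63 comes down to simple arithmetic, sketched here in Python. The 512-byte logical and 4096-byte physical sector sizes are the usual Advanced Format values; check your own drive’s specs if in doubt.

```python
LOGICAL_SECTOR = 512    # what the drive reports to the OS
PHYSICAL_SECTOR = 4096  # what an Advanced Format drive actually uses internally

def is_aligned(start_sector):
    """True if a partition starting at this logical sector begins on a 4K boundary."""
    return (start_sector * LOGICAL_SECTOR) % PHYSICAL_SECTOR == 0

print(is_aligned(63))  # fdisk's old default start: False
print(is_aligned(64))  # the start used in the session above: True
```

A misaligned partition still works, but every 4K filesystem block then straddles two physical sectors, which hurts write performance.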

Then we can finally create the RAID array -

sudo mdadm --create /dev/md0 --level=mirror --raid-devices=2 /dev/sda1 /dev/sdb1

mdadm: array /dev/md0 started.

It will take a while to rebuild, but the array is usable in the meantime. It will just use idle IO bandwidth to rebuild.

“mdadm --detail /dev/md0” will tell you all you need to know about your array.

matthew@nas:~$ sudo mdadm --detail /dev/md0
/dev/md0:
Version : 00.90
Creation Time : Fri Dec 16 13:02:58 2011
Raid Level : raid1
Array Size : 976762432 (931.51 GiB 1000.20 GB)
Used Dev Size : 976762432 (931.51 GiB 1000.20 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Fri Dec 16 13:03:05 2011
State : active, resyncing
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

Rebuild Status : 0% complete

UUID : 20389aa5:944e1839:c7780c0e:bc15422d (local to host nas)
Events : 0.3

Number   Major   Minor   RaidDevice State
0       8        1        0      active sync   /dev/sda1
1       8       17        1      active sync   /dev/sdb1

In this case, the array is still doing the initial sync.

Then,

root@nas:~# mdadm -Es >> /etc/mdadm/mdadm.conf

to make sure the array is automatically reassembled on reboot.

Make sure performance from the array is reasonable -

root@nas:~# hdparm -Tt /dev/md0

/dev/md0:
Timing cached reads:   1776 MB in  2.00 seconds = 888.24 MB/sec
Timing buffered disk reads:  284 MB in  3.02 seconds =  94.10 MB/sec

Create an EXT4 (or your choice of) filesystem on the RAID device, and set journal to writeback (higher performance, potential loss of recent data on power loss)

root@nas:~# mkfs.ext4 /dev/md0

root@nas:/data# tune2fs -o journal_data_writeback /dev/md0

Add the mount point to /etc/fstab

/dev/md0        /data           ext4    noatime,noexec,data=writeback   0       2

You’ll probably also want to set up mdadm to email you when one of your disks fails -

http://ubuntuforums.org/showthread.php?t=1185134

Sharing

Then we can set up the CIFS shares. Assuming we have 2 directories, /data/user1 and /data/user2, to be shared to user1 and user2 respectively.

First we need to create the users -

adduser user1

adduser user2

Then make them own the data directories

chown -R user1:users /data/user1

chown -R user2:users /data/user2

And in /etc/samba/smb.conf, add

security = user

And

[user1]
path = /data/user1
browseable = yes
read only = no
create mask = 0700
directory mask = 0700
valid users = user1

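For each user. (If a user can’t log in to the share, they may also need a Samba password, which can be set with “smbpasswd -a user1”.)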

Then the share should be accessible from Windows and Linux. From Windows, it will be “\\[server ip]\user1”, and can be mapped to a network drive.

If the performance is not satisfactory, there are many samba settings you can try, but it becomes a bit of black art, so I won’t cover that here. I get about 70-80MB/s (both reading and writing) out of the box, sending a big iso file from my computer, so I’m not going to bother optimizing it. Theoretical maximum is 125MB/s on gigabit, but with protocol overheads, etc, the practical maximum is probably somewhere around 90MB/s. The defaults are pretty good.

Snapshot backup

Snapshot backups allow you to basically “look back in time”, and see a copy of everything an hour ago, 2 hours ago, 2 days ago, etc. Obviously, the naive approach would take way too much space.

However, most files will be the same between snapshots, and we can exploit that to save space. For files that haven’t changed, we just need to hardlink them to the original (a hard linked file looks and feels exactly like a duplicate of the original, except they actually refer to the same data on the harddrive). For files that did change, rsync only needs to transfer the differences, although the new snapshot stores a full copy of each changed file.

This way, for example, when you look into the hourly backup directory, you’ll see a directory for 1 hour ago, a directory for 2 hours ago, etc, and while they all look like real hard copies, they only take up the space of the files that changed between then and now.
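The hardlink trick is easier to believe once you see it. The Python sketch below is not rsnapshot itself, just a toy illustration in a temporary directory: it links an unchanged file into a newer “snapshot”, checks that both names point at the same data, then replaces the newer copy the way rsync does by default (write a new file, rename it into place). The directory and file names are only for illustration.

```python
import os
import tempfile

def hardlink_snapshot_demo():
    """Build two tiny 'snapshots' and return what happens to a shared file."""
    with tempfile.TemporaryDirectory() as root:
        older_dir = os.path.join(root, "hourly.1")
        newer_dir = os.path.join(root, "hourly.0")
        os.mkdir(older_dir)
        os.mkdir(newer_dir)

        older = os.path.join(older_dir, "notes.txt")
        newer = os.path.join(newer_dir, "notes.txt")
        with open(older, "w") as f:
            f.write("version 1\n")

        # Unchanged file: the newer snapshot gets a second name for the same data.
        os.link(older, newer)
        same_inode = os.stat(older).st_ino == os.stat(newer).st_ino
        link_count = os.stat(older).st_nlink

        # Changed file: write a fresh copy and rename it over the link,
        # so the older snapshot keeps pointing at the old data.
        tmp = newer + ".tmp"
        with open(tmp, "w") as f:
            f.write("version 2\n")
        os.replace(tmp, newer)

        with open(older) as f:
            old_text = f.read()
        with open(newer) as f:
            new_text = f.read()
    return same_inode, link_count, old_text, new_text

print(hardlink_snapshot_demo())
```

Note that editing a linked file in place (instead of replacing it) would change every snapshot that shares it, which is why snapshots should be treated as read-only.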

This sounds hard. And it is, if you have to implement this yourself. Fortunately, there’s a program called rsnapshot that will handle everything for you.

So install rsnapshot using your favourite method (apt-get on Debian/Ubuntu), and edit /etc/rsnapshot.conf to set up your backup. The file is very straightforward, so I won’t talk about that.

It’s better to do the backups to an external drive or another machine, but I’m just putting it on my /data partition. Remember to use the “exclude” option to exclude things you don’t want backed up (disk images, etc). I personally use a “nobackup” directory under my share to store files I don’t want backed up. In my case, I have 384GB in total, but only 11GB or so are really important and need to be backed up.

The backup directories can be shared via CIFS if you want. Though it’s probably a good idea to make that read-only, since writing to a snapshot will corrupt other snapshots (because they are all linked). Can’t change history…

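In the end, you’ll want to have cron execute the backup jobs automatically. For example, add to /etc/crontab (entries in this file need a user field before the command; adjust the path if rsnapshot is installed elsewhere, “which rsnapshot” will tell you) -

0 */4 * * *       root    /usr/local/bin/rsnapshot hourly

30 23 * * *       root    /usr/local/bin/rsnapshot daily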

This is for hourly = every 4 hours, and keeping daily snapshots.

That’s it! Next time you accidentally delete or change something, just go pick it up from the last snapshot. The maximum amount of work you’ll lose depends on the interval setting. In my case, that’s 4 hours.

Internet access

On the server side, nothing needs to be done (except installing OpenSSH, if it’s not installed by default and you haven’t installed it). Internet access can be done over SSH (SFTP). You’ll also need to set up port forwarding in your router (port 22, TCP), and maybe get a dynamic domain name (eg. dyndns.org) so you don’t have to remember your IP.

On Linux, sshfs can be used to mount a remote filesystem – “sshfs username@yourserver:/data/user1 mount_point”

On Windows, there is a commercial program called ExpanDrive that’s pretty good but also pretty expensive (has trial). Any SFTP client will do.

On Mac, I have no idea. Sorry.

It’s possible to use SSH over LAN, too, and not worry about Samba. However, from my testing, SSH performance on a gigabit network is very bad, probably due to the mandatory encryption, especially on a low-power CPU. That doesn’t matter over the Internet because you’ll be limited by Internet speed anyways. Samba, on the other hand, is not suitable for the internet, because it’s very latency-sensitive (small packets, wait for ack, etc).

Not covered: SMART monitoring. I personally find it pretty useless because the false positive and false negative rates are both too high. I just rely on the RAID for hardware integrity. If you want to use it, Googling “smartmontools” would be a good start. Also – offsite backups. You’ll want to do that for very important data, but it’s not practical in my case.

That’s all! Happy storing! (until you receive an email telling you a disk is failing)

Posted on December 17th, 2011 by matthew – 2 Comments

Interview Frenzy 2

This post has been sitting in my draft folder for a very long time, I forgot about it, but here it is -

Third round of job searching. I didn’t get nearly as many interviews as I had hoped, probably due to a sloppy resume, and I’m also a lot more selective this time, applying only to hardware positions (too much software for me, need to take a break… life is about balance!). I did get some really cool interviews, though.

NVIDIA (System Engineer, post-silicon verification)

Very intense interview! An entire hour of quizzing, on everything from C++ to analog circuits (filter transfer function). Most of them are fairly standard, though, and programming part is easy (I guess they don’t really expect electrical engineers to know how to program). I can’t believe I actually programmed in C++ and VHDL over the phone.

  • How to construct a NOR gate using 1 input multiplexers? (LUT, cascaded muxes)
  • How to construct a D flip flop with async reset from regular DFF (I’m still not sure. Mux on output and input?)
  • Merge 2 files containing lists of words, remove duplicates, sort, and output to third file (binary search tree, pre-order traversal)
  • Given a series RC circuit, determine response at DC and 1MHz (1 pole transfer function)
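  • 100 students took an exam. mean score = 500, SD = 100. Highest 15% pass. Is 650 pass? (basic stats: assuming roughly normal scores, 650 is 1.5 SD above the mean, and the top 15% starts at about 1.04 SD, or roughly 604, so yes)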
  • With a 3L cup, a 5L cup, and infinite supply of water, measure 4L (classical interview question. somewhat hard. did a depth-first search in my head)
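
The cup puzzle is a small graph search, so here is a minimal sketch of it in code. It uses breadth-first search rather than the depth-first search mentioned above, because BFS also finds the shortest sequence of moves. The names solveCups, State, and Move are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// A state is (litres in cup A, litres in cup B).
typedef std::pair<int, int> State;

struct Move
{
	State result;     // state after the move
	const char* name; // human-readable description
};

// Returns the shortest list of moves that leaves `target` litres in either
// cup, or an empty list if that amount cannot be measured.
std::vector<std::string> solveCups(int capA, int capB, int target)
{
	assert(capA > 0 && capB > 0 && target > 0);

	const State start(0, 0);
	// For every visited state: the state it was reached from and the move used.
	std::map<State, std::pair<State, std::string> > cameFrom;
	std::queue<State> frontier;

	cameFrom[start] = std::make_pair(start, std::string());
	frontier.push(start);

	while (!frontier.empty())
	{
		const State cur = frontier.front();
		frontier.pop();
		const int a = cur.first;
		const int b = cur.second;

		if (a == target || b == target)
		{
			// Walk back to the start, then flip the moves into forward order.
			std::vector<std::string> moves;
			for (State s = cur; s != start; s = cameFrom[s].first)
			{
				moves.push_back(cameFrom[s].second);
			}
			std::reverse(moves.begin(), moves.end());
			return moves;
		}

		const int aToB = std::min(a, capB - b); // how much fits when pouring A into B
		const int bToA = std::min(b, capA - a); // how much fits when pouring B into A
		const Move next[] = {
			{ State(capA, b), "fill A" },
			{ State(a, capB), "fill B" },
			{ State(0, b), "empty A" },
			{ State(a, 0), "empty B" },
			{ State(a - aToB, b + aToB), "pour A->B" },
			{ State(a + bToA, b - bToA), "pour B->A" },
		};

		for (std::size_t i = 0; i < sizeof(next) / sizeof(next[0]); ++i)
		{
			if (cameFrom.find(next[i].result) == cameFrom.end())
			{
				cameFrom[next[i].result] = std::make_pair(cur, std::string(next[i].name));
				frontier.push(next[i].result);
			}
		}
	}

	return std::vector<std::string>();
}
```

With A as the 3L cup and B as the 5L cup, solveCups(3, 5, 4) gives 6 moves: fill B, pour B->A, empty A, pour B->A, fill B, pour B->A. That leaves 4L in the 5L cup.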

NVIDIA second interview

This one is all on digital logic, still fairly difficult. Talked in detail about the PCB I made for robotics lab, and firmware programming.

  • How to construct a D-latch using transmission gates and inverters (digital feedback, bus contention)
  • What is the delay of the latch (add up delays of signal path)
  • How to construct a D-flipflop using 2 D-latches (in series, one gets inverted clock)
  • What are the parameters (setup time, clock-to-Q) of the D-FF (delay of first latch, and delay of second latch)
  • 2 D-FF with combinational logic between them. What is the timing constraint given delay of the logic circuit, setup time, clock-to-Q, clock delay to second FF, and clock jitter (clock jitter is the hard part, 2tj must be added to the minimum period because in the worst case, 2 consecutive clock edges can be only T - 2tj apart)
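
Written out, the setup constraint from the last question looks roughly like this. The symbols are assumptions for illustration, not from the interview: T is the clock period, t_cq the clock-to-Q delay of the first FF, t_logic the worst-case logic delay, t_su the setup time of the second FF, t_skew the extra clock delay to the second FF, and t_j the peak jitter on each clock edge.

```latex
T + t_{skew} - 2t_{j} \;\ge\; t_{cq} + t_{logic} + t_{su}
\quad\Longleftrightarrow\quad
T \;\ge\; t_{cq} + t_{logic} + t_{su} - t_{skew} + 2t_{j}
```

The launching edge can arrive late by t_j and the capturing edge early by t_j, which is where the factor of 2 comes from.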

Nuvation (firmware developer)

Mostly talked about projects I have done.

  • on a microcontroller, when is it appropriate to use interrupts, when is it not? (responsiveness, external stimuli, easy to introduce bugs if interrupt handler shares data with main loop, etc)
  • how to test software (whitebox, blackbox, edge cases, typical cases, etc, big words, pretend to be a software engineer)

Nuvation second interview

Again mostly projects I have done, and some project management stuff. How I organize the electrical team at Thunderbots, etc.

  • how to implement brushless motor controller in FPGA and MCU (probably because I mentioned brushless motors in my resume)
  • some simple questions about PCB design that I don’t remember. Something about vias and component packages…

Sifteo (Electrical engineer)

Mostly talked about my involvement with Thunderbots, PCB design experiences, and what it is like to work at Sifteo. They make very cool stuff!

  • how to bring up and test a new circuit (different approaches, top-down, bottom-up, etc)

I really liked the Sifteo position, but I ended up choosing NVIDIA because it’s my first work term in the US, and working for a big company simplifies things (visa, housing, etc). And California!!!

Posted on October 30th, 2011 by matthew – 2 Comments

10,000 Hours Rule (Outliers: The Story of Success)

“Ten thousand hours of practice is required to achieve the level of mastery associated with being a world-class expert – in anything … composers, basketball players, fiction writers, ice skaters, concert pianists, chess players, master criminals” – Malcolm Gladwell, Outliers: The Story of Success.

Very inspirational book, highly recommended.

The author observed that no one can become a world-class master in any field without putting in about 10,000 hours of practice and, perhaps more importantly, that no one can NOT become a world-class master after putting in 10,000 hours of practice.

Geniuses don’t exist. They are merely a combination of opportunity and 10,000 hours of practice. They are self-fulfilling prophecies.

Makes sense if you think about it – you randomly pick a kid, tell him he has great potential to become the greatest hockey player ever, and put him through 10 years of 10 hours practice a day. In the end, you will get the greatest hockey player ever. Your prediction was correct.

Bill Gates was a high school kid at a time when only universities had computers. He went to a high school for rich people, where rich parents decided to buy the school a computer (opportunity). He programmed day and night for 10 years. 10,000 hours later, he became a world-class programmer (at that time).

Mozart famously started composing at six, but he did not produce masterpieces until he was 20, by which time he had accumulated about 10,000 hours of practice.

Bobby Fischer, famous ex-World Chess Champion, also spent about 10 years of intense practice before becoming a grandmaster.

All NHL players were born in the right months, arbitrarily selected by their birthday when they were 8-9 years old (*), and put through about 10 years of intense practice.

There is no field in which anyone can become a world-class expert with less practice, or not become one with more practice.

Do I believe it? I don’t know. I’ll give it a try and let you know.

* NHL, and most other professional sport leagues, select players by their birthday. Funny? I thought so, too. And it’s true.

If you look at the birthdays of NHL players, there is an overwhelming number of players born in January, February, and March. Very few in the later months, October, November, December. Is it because people born in earlier months are more talented? Of course not.

To become an NHL player, one must be selected at an age of 8-9, in a tryout, for junior league. They don’t want to miss any “talent”, so it has to be done at a young age. Everyone is the same age. That sounds fair? It does, until you realize that people who are born on December 31 need to compete with people born on January 1, almost a whole year older. At an age of 8-9, kids grow A LOT in a year’s time. 9 year olds are much bigger and more coordinated than 8 year olds. Then the selected ones go through much more intense training, and self-fulfill the prophecy of being the most talented ones.

Someone reportedly went to talk to the NHL about this. They agreed. And they said they are not going to fix it because it’s “too complicated” to have to hold different tryouts for different month groups. They would rather lose about half the talent.

So if you ever want your kid to become an NHL player, try to conceive in April, to give birth in January. If you accidentally give birth too early in December, might as well just tell him to pick up a new hobby. Painting or something, since he probably won’t get into any professional sport, because most of them also have a January cutoff.

Posted on October 21st, 2011 by matthew – 3 Comments

RIP Steve Jobs

(This post is not pre-written, hence the few hours delay)

RIP Steve Jobs.

As an engineer, I am a big fan of ingenuity, and Steve showed a lot of that in his life.

I will not bore you with a list of things he did, since you probably have read that a thousand times from different places already, and the list is way too long.

I have a lot of respect for people that are original, and are willing to take risks to move the world forward. Steve is a man that deserves recognition.

Aside – I have always been an anti-fan of Apple for its shady business practices, and that has not changed. Only a job offer from Apple would change that. If I ever stop badmouthing Apple, that’s probably what happened. This is a completely different issue.

Now I am really curious about what will happen to Apple after their source of innovation is gone. iSheep will still be iSheep, but I’m sure there are also Apple fans that will leave once Apple becomes just any other tech company. I do not believe anyone can bring Apple back to the former golden Steve-era (early to mid 2000s, when Apple changed the world in so many ways). With the releases of recent Apple products, it’s not hard to notice that Steve’s absence/illness has already stopped innovation.

Can another person like Steve come out to push the tech world forward?

Posted on October 5th, 2011 by matthew – 5 Comments

Biological Fish Tank Algae Control

One of the biggest problems all fish keepers need to deal with at some point is algae.

If you just set up a tank with fishes and leave it alone, it’s almost guaranteed that within 2 weeks, it will look like this -

All the glass surfaces, gravel, and decorations will be covered by algae, and the water may turn green due to microscopic algae.

Not exactly pretty eh? Truth is, fish keeping is not as easy as it appears to be. Fishes can be as high maintenance as other pets, approaching the level of a significant other.

Algae can be caused by a million different things, such as nutrient imbalance, inappropriate lighting (intensity, duration, and spectrum), and decomposing waste.

This post is about how you can carefully set up the fish tank to manage its own algae problem, biologically. There are chemical solutions, but I don’t like having to regularly dump harmful chemicals into my tank. You can also clean the tank manually twice a month, but that’s annoying, and some types of algae are very hard to get rid of.

The biological approach consists of plants, algae-eating fish, and snails.

1. Plants
Plants are very important. I don’t think it matters what kind of plants, but having plants in the tank introduces competition for the nutrients both plants and algae need. It should be noted, though, that plants will only use nutrients when they are photosynthesizing. For effective photosynthesis, there should be a good amount of light and, more importantly, light in the right spectrum. 6500K CFL works well for me. Most light bulbs sold at hardware stores will be too yellow, and will only promote algae growth, since algae are a lot less picky about colour temperature. The second thing is carbon. Everyone has a different theory about how carbon should be added, if at all. For me, Seachem Flourish Excel works well. It’s actually not carbon, but a complex photosynthesis intermediate that reportedly cannot be utilized by algae.

Nitrate is probably the most important nutrient to keep in check. It’s generated by decomposing fish waste (ammonia -> nitrite -> nitrate). While it doesn’t do anything to fishes (except at very high concentration), it’s an important nutrient for plants. Without plants to consume it, nitrate concentration will just keep going up, and an algae bloom will result. Therefore, it’s crucial to change water regularly to remove nitrate if the tank is not planted. I find that in my planted tank, nitrate level never goes up.

2. Algae-Eating Fish
Some fishes will eat algae, which is great, but care must be taken to ensure that they are compatible with the water conditions (hardness, pH, and temperature), as well as other fishes in the tank. Below are some algae eaters that are compatible with most tanks, and peaceful towards other fishes.

Otocinclus catfish solves most of the problem.

They are pure herbivores, so they won’t eat most of the flake food you feed other fishes with. They are also somewhat delicate, and require high water quality (proper filtering, good pH, no ammonia, etc). They also like to rest on leaves, so I wouldn’t put them in an unplanted tank. If there’s not enough food for them, I recommend feeding them sliced zucchini. They (and most other fishes) love that stuff. My tank of about 15 fish can finish a 3mm slice in about 2 days.

They eat almost all kinds of algae. One notable exception is hair algae. They don’t touch that stuff.

You may be thinking that’s fine. There will just be a little bit of hair algae left. Unfortunately that’s not how it works. Because otos only eat other types of algae, they are applying selective evolutionary pressure to the algae population. Darwin will eventually kick in, and replace all your algae with hair algae. This is essentially how evolution works, just much accelerated. I witnessed this first hand.

So now we need a fish to eat hair algae. Siamese algae eater (SAE) is a popular choice.

They eat all kinds of algae, but especially hair/thread algae. Fairly hardy fish that eat just about everything.

Plecos are another popular choice, but they are only suitable for much bigger tanks.

Black mollies also reportedly eat hair algae, though not as much as SAE.

3. Snails
Just regular aquarium snails that seem to come with all plants. Some people find them unsightly, but they do clean the glass surfaces pretty well. They require no care at all.

If snail population booms, they can be removed with some lettuce.

This is my tank right now.
IMG_20110927_231414

Almost no algae at all (only a little bit on the filter). This is about a month after I implemented this system. No maintenance required so far.

This is a 20 gallon tank with 5 random plants, 4 guppies, 5 neon tetras, 2 otocinclus catfish, and 2 Siamese algae eaters. High water quality (a lot of guppy reproduction going on). 25C, pH 7.0, moderately hard. 2×15W 6500K CFL lighting. Fishes look happy, and plants are thriving (although the red plant is turning green, suggesting slow growth).

Happy fish keeping!

Posted on September 27th, 2011 by matthew – 3 Comments

Kids and Plastic Surgery?!

An awesome psychologist friend of mine just told me about this in our little chat the other day, and I thought it was very interesting -

In North America, in the past year, the number of plastic surgery patients between ages 3 and 16 has increased by 30% (sorry can’t find reference. Not verified. I could’ve remembered wrong, she could have remembered wrong, etc). But yes! 3-year-olds getting plastic surgery! They are only babies!

My first thought was disgust, but after some more discussion, I’m swayed and am now undecided.

The biggest reason given is to prevent bullying. It’s a fact that kids with unusual facial features (most commonly big nose, big ears) get bullied more.

The kids aren’t usually the ones that request the surgery, though. It’s usually the parents, most of whom have childhood memories of being bullied and don’t want the same to happen to their kids.

Plastic surgery should not be the solution to the bully problem. That’s sending very wrong messages to the kids -
1) You are ugly. Too ugly for the society.
2) If someone laughs at you or bullies you, it’s your fault, and you should go fix it.

Instead of teaching kids how to deal with bullying by standing up for themselves and being confident, we are telling them to submit to bullying, and change themselves.

On the other hand, kids that undergo surgery do become a lot more confident, more successful, and happier in general. For this reason, the government will even pay for plastic surgeries for children.

Placebo effect definitely plays a part as well. If they think they are prettier, they will be more confident, and therefore more attractive and sociable. Maybe it has nothing to do with physical changes on the face at all.

What do you think? A morally wrong solution to make happy kids?

By the way, if you haven’t seen this -

Magic!

Posted on September 15th, 2011 by matthew – Comments Off

Primitive Multi-Tasking Using Switch Statement

In all things that must be done, there’s always the right way, and then there is my way.

Today at work, I suddenly realized that my embedded application has to be multi-threaded, because there are a few tasks with strange timing requirements (that’s not the topic of this post).

This is on an embedded system with no OS, so no scheduler, setjmp/longjmp, etc.

I could install a proper RTOS (eg. FreeRTOS) to get a real scheduler, and rewrite my whole program to be multi-threaded. But it’s a big program, and it has to work by the end of the week.

So reality happened – I decided to retrofit cooperative multi-tasking into the program, using what I think is a novel approach I just invented – switch statements!

This is what the code looked like. Just regular update loop.

void UpdateSubsystem1()
{
	...
}

void UpdateSubsystem2()
{
	...
}

void UpdateSubsystem3()
{
	...
}

void Update()
{
	UpdateSubsystem1();
	UpdateSubsystem2();
	UpdateSubsystem3();
}

Some of the subsystem update functions take a long time, because, for example, they send things over communication buses.

I can rewrite everything as state machines, but time is of the essence… So here is what I did to each subsystem update function.

First, I enclosed the whole body of the function in a switch statement, and inserted case labels at my “restart points” (points where I want to be able to “yield”). I also added a static variable to hold which stage the function/task is in.

So if my original code looked like this -

void UpdateSubsystem()
{
	doSomething();

	while (!SomethingHappened()) {}

	doSomething();

	while (!SomethingHappened()) {}

	doSomething();
}

Now it looks like this -

void UpdateSubsystem()
{
	static int stage = 0;

	switch(stage)
	{
	case 0:
		doSomething();

	case 1:
		while (!SomethingHappened()) {}

		doSomething();

	case 2:
		while (!SomethingHappened()) {}

		doSomething();
	}
}

Note that there are no breaks. This is intentional. We want the program to fall through everything.

Then, when I want the task to yield, I just have to set “stage” to the restart point, and simply return!

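void UpdateSubsystem()
{
	static int stage = 0;	// restart point to resume from on the next call

	switch(stage)
	{
	case 0:
		doSomething();

	case 1:
		if (!SomethingHappened())
		{
			// not ready yet: remember where we are and yield
			stage = 1;
			return;
		}

		doSomething();

	case 2:
		if (!SomethingHappened())
		{
			stage = 2;
			return;
		}

		doSomething();

		stage = 0;	// all done: start from the top on the next call
	}
}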

Cool eh?

A few caveats -
1. Most importantly, all variables must be static/global. Local variables can only be used between 2 restart points, because the stack frame for the call is destroyed when the function returns.
2. Cannot start the same task more than once. Function is no longer re-entrant due to the use of static variables.

Probably a few more I missed.
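
To make caveat 1 concrete, here is a minimal sketch of a task that yields from inside a loop. The loop index has to be static, otherwise its value is gone by the time the function is called again. BusSendByte, g_sent, and SendMessageTask are hypothetical names for this example only; the "bus" just appends to a string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a slow communication bus: one byte per call.
static std::string g_sent;
static void BusSendByte(char c) { g_sent += c; }

// Sends one byte of `msg` per call and yields in between.
// Returns true once the whole message has been sent.
bool SendMessageTask(const char* msg, std::size_t len)
{
	static int stage = 0;
	static std::size_t i = 0;	// must be static: it has to survive the return below

	switch (stage)
	{
	case 0:
		for (i = 0; i < len; ++i)
		{
			BusSendByte(msg[i]);

			stage = 1;
			return false;	// yield to the other tasks

	case 1:;			// resume here, inside the loop body, on the next call
		}

		stage = 0;	// finished: start from the top on the next call
	}

	return true;
}
```

Calling SendMessageTask("abc", 3) repeatedly returns false three times (one byte sent each time) and then true. The case label inside the for loop is legal because no initialized local variable is skipped by the jump, which is one more reason to keep the state in statics.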

This is not patented already, right?

Update:
Macro implementation:

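#include <iostream>

// Each yield uses its own line number as the resume point, so never put two yields on one source line.
#define RESTARTABLE_BEGIN static int restartable_stage = 0; switch(restartable_stage) { case 0:
#define RESTARTABLE_YIELD restartable_stage = __LINE__; return false; case __LINE__:
#define RESTARTABLE_END } restartable_stage = 0; return true;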

Example:

bool task1()
{
	RESTARTABLE_BEGIN;

	std::cout << "t1 -> 1" << std::endl;

	RESTARTABLE_YIELD;

	std::cout << "t1 -> 2" << std::endl;

	RESTARTABLE_YIELD;

	std::cout << "t1 -> 3" << std::endl;

	RESTARTABLE_END;
}

bool task2()
{
	RESTARTABLE_BEGIN;

	std::cout << "t2 -> 1" << std::endl;

	RESTARTABLE_YIELD;

	std::cout << "t2 -> 2" << std::endl;

	RESTARTABLE_YIELD;

	std::cout << "t2 -> 3" << std::endl;

	RESTARTABLE_YIELD;

	std::cout << "t2 -> 4" << std::endl;

	RESTARTABLE_YIELD;

	std::cout << "t2 -> 5" << std::endl;

	RESTARTABLE_END;
}

int main(int argc, char* argv[])
{
	bool task1_done = false;
	bool task2_done = false;

	while (!(task1_done && task2_done))
	{
		if (!task1_done)
		{
			task1_done = task1();
		}

		if (!task2_done)
		{
			task2_done = task2();
		}
	}

	return 0;
}

This program actually triggers a bug in VS2010. It doesn't like __LINE__ as a case label (saying it's not constant) if edit and continue debugging is turned on.

Workaround provided by putty (http://rc.quest.com/viewvc/putty/branches/group-policy/putty/ssh.c?view=markup) -

 * In particular, if you are getting `case expression not constant'
 * errors when building with MS Visual Studio, this is because MS's
 * Edit and Continue debugging feature causes their compiler to
 * violate ANSI C. To disable Edit and Continue debugging:
 *
 * - right-click ssh.c in the FileView
 * - click Settings
 * - select the C/C++ tab and the General category
 * - under `Debug info:', select anything _other_ than `Program
 * Database for Edit and Continue'.
 */

Posted on August 17th, 2011 by matthew – 5 Comments