Today, I'm releasing something that I've wanted to release for a very long time. It's a project that I worked on during my Ph.D., and while I don't think it'll be terribly useful to anyone, a lot of work went into it that I want to preserve, even if just for myself.
That project is Jumbo, and it's now available on GitHub in two flavors: Jumbo for .Net 6+, and the original for .Net Framework and Mono. If you want to play around with it or learn more about it, you probably want the former.
Jumbo is an experimental large-scale distributed data processing system, inspired by MapReduce and in particular Hadoop 1.0. Jumbo was created as a way for me to learn about these systems, and should be treated as such. It's not production quality code, and you probably shouldn't entrust important data to it.
Basically, back when I was getting started with my Ph.D. in 2008, I found myself staring at the code of Hadoop (which wasn't even at version 1.0 yet at the time), and finding I wasn't really getting a good feel of how the whole thing fit together, and what really goes into designing a system like that.
So, some people at my lab suggested I should try building something for myself, which I did. I built, from the ground up, a distributed file system and data processing system, which is Jumbo. It was heavily inspired by Hadoop, and definitely borrows from its design (although no actual code was borrowed). In some aspects, I deviate from Hadoop quite a lot (especially since Jumbo isn't constrained to only using MapReduce).
Building Jumbo taught me a lot: about software design, about distributed processing, about decisions that affect scalability, and more. It's my hope that maybe, someone else interested in these topics might want to look at it and find what I did interesting. If nothing else, I just want to preserve this massive project that I did (still the biggest project I've done where I'm the sole contributor), and have its history available.
I did end up using Jumbo for some research efforts, which you can read about in a few papers as well as my dissertation under the University section of my site.
Jumbo is also the origin of one of my most widely used libraries, Ookii.CommandLine, so it's significant in that respect as well.
Like I said, I've wanted to release Jumbo for a long time. If you look through the original project's commit history you can see a bunch of work done in early 2013 (as I was nearing the end of my Ph.D.) like cleaning stuff up and adding documentation, but I never quite reached a level where I was comfortable doing so. The project, which primarily targeted Mono to run on Linux, wasn't that easy to set up and run.
In 2019, I ported the project to .Net Core, just to see if I could. That version was easier to play around with, and I wanted to release it then too, but I never quite got around to finishing it, until now.
So now, you can look at Jumbo and play around with it on .Net 6+, thanks to this new version. I've also expanded the documentation significantly, so it should be easy to get started and to learn more about how it works. The original Jumbo project for Mono and .Net Framework is only provided to preserve the original history of the project (the new repository only contains the history of the port). You probably shouldn't try and run it (though I obviously can't stop you).
If you want to comment on Jumbo or ask any questions, please use the discussions page on GitHub.
Every semester, the Graduate School of Information and Communication Engineering at the University of Tokyo holds a special seminar (in Japanese: 輪講 rinko) where the various students give a presentation on a topic that will likely be related to their research (although people switching subject entirely afterwards is not unheard of :P ). Basically you do a survey of some recent papers in your field. Since I'm a new student, I had to give a presentation as well (in the future I will need to attend this seminar but I won't need to do another presentation myself).
This presentation is taken rather seriously. Although it is only 25 minutes long, everybody makes an awful fuss about it. I don't think there was this much to do about even my final presentation for my Master's degree. :)
They even made me do two rehearsals beforehand with some members from my lab. Although I'm not particularly fond of doing that, one positive result was that this is the first presentation in years where I didn't go over time. I had 25 minutes, and I did it in 23.
Now I've passed this particular hurdle, I should really start looking into what exactly I want to do for my research (I have a general idea, but nothing fixed yet). Not that I won't still be busy in the meantime: next week I've got another paper to do for the Web Engineering lecture, and I also have a presentation coming up for Distributed Systems. Neither of which should be anywhere near as time-consuming as the Rinko stuff, though.
On top of which, I seem to have caught a cold somewhere. :( It doesn't look like it's going to be too bad though.
One notable difference between doing a PhD here and doing it in the Netherlands is that here, there's a requirement to get some credits from taking lectures. I am required to get 8 credits, which by whatever system they're using (I'm not sure what it is, it's definitely different from ECTS though) comes down to four courses.
This isn't a problem of course. I have registered for four courses this semester so if I get credits for them all I'll be done with it. One factor that makes this more interesting is that most lectures are given in Japanese. Only one out of the four lectures I'm taking is in English. So can I understand those lectures in Japanese? Not really. One of them is okay because he has slides that are in English, but the other two are not so easy. But it doesn't really matter, I'll just have to attend the lectures and submit a paper (which I can write in English) and I'll get the credits. Sure, there are more useful ways I could spend my time, but it's a requirement and there's nothing I can do about it, so there's no point in complaining. I've got my laptop and wireless Internet so I can just work on something else during the lectures if I can't understand them.
Besides those four courses I also have to participate in a special seminar for students from my department. Here Master and PhD students give a presentation, usually a survey of recent research in whatever field they're also working in. Of course most of those are also in Japanese. I'll have to give a presentation myself as well of course, that one will be in English.
So with those four lectures, the seminar and Japanese classes three times a week (I'm taking those again as well), my schedule is looking pretty full (and it's all at Hongo so I have to spend some time travelling too, about an hour in each direction). And wasn't I supposed to do research too? :P
At least I won't be bored. :)
One of the nicer consequences of me passing the exams is that my research lab at University would provide me with a new laptop. They'd already given me an old Panasonic to get me through the intervening months, but now that I'm a real student at the lab, they can afford to get me new stuff.
Based on Internet reviews I chose the Dell XPS M1330. Today, it arrived.
The pros: unlike your regular Dell laptop, this one looks extremely cool. The 13.3" widescreen WLED screen is fantastic. I like the keyboard so far (it's not gotten any real workout though). It has a fingerprint reader which is just plain cool (when Windows shows the welcome screen I just scan my finger and it logs on). Vista runs great on it, it's very fast, but with a Core 2 Duo T7500 (2.2GHz) and 2GB RAM, what do you expect? I also specifically picked the optional 7200RPM hard drive because I hate the usual slow laptop hard drives.
The cons: it's heavier than I expected. Maybe I've gotten spoiled by the Panasonic which weighs nothing. It's not super heavy but it could've been better. The edges around the screen are rather wide, without those the laptop could've been much smaller. It has only two USB ports. Dell installed the usual amount of crap on it that I had to take some time to remove.
One interesting thing is that Dell Japan didn't offer an option for an English version of Windows (especially weird since they did offer an English keyboard layout). But they did offer an option to get Vista Ultimate for a relatively small amount of money (which I didn't have to pay anyway :P ). So I got that and then used the English language pack available through Vista Ultimate Extras to turn the UI to English, which works great.
So overall I'm very pleased. But then again, since I've just gotten a great laptop for free, why wouldn't I be? :)
Last post I said I would get the official exam results on September 7th. That's still true.
But my professor called me today to tell me unofficially that I have passed the exam. I am now, for real, a PhD student at Tokyo University!
Goukaku shimashita! Ureshii desu! (I passed! I'm happy!)
Officially I start October 1st, unofficially I can start pretty much right now. Of course the big first task is going to be finding a research topic.
Now let's hope life as a PhD student isn't too much like it's depicted in PHD Comics. :P