
Wednesday, January 25, 2012

ICPC @ TUM and a Quick State of the Union

Guess what's coming to TUM: the ICPC!
Alright, you didn't actually have to guess.
The ICPC is a yearly programming contest held at universities around the world, so naturally I was very excited to find out today that TUM is participating too. Teams can have two or three members, and I've already recruited Robert for mine, which is definitely something to be excited about as well, given that he's been coding since he was about 12 years old.
I've been doing a lot of problems on Project Euler lately (in fact, I recently hit 50 solved problems), but I expect competing in a live event will feel very different.
The competition is a week from Saturday (04/02/2012) and, as far as I know, runs for five hours, during which each team tries to solve as many of the 10 problems as it can. At first glance I was hesitant to spend all of Saturday at uni, but then I found out there's gonna be pizza, so that was that.

If you're studying at TUM and are looking for a team to join for ICPC, drop me a comment here or send me an email! You can find more info on ICPC here: http://cm.baylor.edu/welcome.icpc and here: http://icpc.in.tum.de/index.php/Main_Page

By the way, Robert is also the guy who's working on the WikiGraph (still unsure of that name) project with me. He's been talking about starting a blog, so I'll be sure to post a link here once that happens. Seriously, the guy has skills.
Speaking of WikiGraph, there's some news on what's going on with that as well:

  • We're getting good graphs for networks of up to fifteen thousand or so nodes, but beyond that our hardware can't finish a layout in a reasonable amount of time.
  • I've talked to two of my professors, and it looks like we may be able to get time on a 30-40 core cluster to run our program, which might allow us to graph the entire German Wikipedia (1.2 million articles). That would be pretty amazing, but we'll definitely have to work on scaling the algorithm properly so it doesn't break down as the network gets huge.
  • I'm also working on a way to read the input from a database (specifically the Wikipedia dumps of all pages and pagelinks, which are available at http://dumps.wikimedia.org/); I'll post some technical details on that later. The gist of it is that I have to join the pages dump (320MB) with the pagelinks dump (2.6GB) to get a full description of all the connections. I'm still not sure what the best way to do that is, and with data this size I don't want to waste time on an inefficient approach. Finding disconnected subgraphs may also turn out to be a problem (a rough sketch of both steps follows below this list).
  • Different datasets are also in the works; one thing I'm particularly interested in is mapping Darknet/Meshnet, so I'm looking for a way to actually extract the full set of nodes and their connections. I'm not even sure it's possible; I'll have to look into it more closely.
  • Finally, I used some of the source code from this project to create a simulation of planets/particles interacting through gravity, basically a planetary system. It's almost more fun to play around with, in fact. More on that soon as well; a minimal sketch of the core idea is further below.
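For the dump-joining step, here's roughly the shape of what I have in mind. This is only a minimal sketch: it assumes the SQL dumps have already been flattened into simple tab-separated files (a hypothetical pages.tsv with "id<TAB>title" lines and a pagelinks.tsv with "from_id<TAB>target_title" lines), whereas the real dumps are giant SQL INSERT statements that would need parsing first, and it ignores namespaces entirely. The union-find at the end is one straightforward way to spot the disconnected subgraphs mentioned above:

    from collections import defaultdict

    def load_pages(path):
        """Map article titles to page ids.
        Assumes a pre-flattened TSV; the real dump is SQL."""
        title_to_id = {}
        with open(path, encoding="utf-8") as f:
            for line in f:
                page_id, title = line.rstrip("\n").split("\t", 1)
                title_to_id[title] = int(page_id)
        return title_to_id

    def build_edges(pages_path, links_path):
        """Stream the pagelinks file and resolve each target title to an id."""
        title_to_id = load_pages(pages_path)
        edges = []
        with open(links_path, encoding="utf-8") as f:
            for line in f:
                from_id, target_title = line.rstrip("\n").split("\t", 1)
                to_id = title_to_id.get(target_title)
                if to_id is not None:  # skip links to pages that don't exist
                    edges.append((int(from_id), to_id))
        return edges

    def components(edges):
        """Group nodes into weakly connected components via union-find."""
        parent = {}
        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x
        def union(a, b):
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb
        for a, b in edges:
            union(a, b)
        groups = defaultdict(list)
        for node in parent:
            groups[find(node)].append(node)
        return list(groups.values())

    # e.g.: edges = build_edges("pages.tsv", "pagelinks.tsv")

Holding all ~1.2 million title-to-id pairs in memory should be manageable, and streaming the 2.6GB pagelinks file line by line avoids ever loading it whole; whether this is actually the best approach is exactly what I still need to figure out.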
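As for the gravity simulation: at its core it's nothing more than pairwise Newtonian attraction plus a simple time-stepping scheme. The sketch below is my own minimal 2D reconstruction of that idea, not the project's actual code; the constants and the softening term are made up to keep the toy example stable:

    import math

    G = 1.0  # toy gravitational constant; the real value depends on your units

    class Body:
        def __init__(self, x, y, vx, vy, mass):
            self.x, self.y = x, y
            self.vx, self.vy = vx, vy
            self.mass = mass

    def step(bodies, dt, softening=1e-3):
        """One semi-implicit Euler step of pairwise Newtonian gravity.
        The softening term keeps close encounters from blowing up."""
        for b in bodies:
            ax = ay = 0.0
            for other in bodies:
                if other is b:
                    continue
                dx, dy = other.x - b.x, other.y - b.y
                dist = math.sqrt(dx * dx + dy * dy + softening)
                # Newton's law: acceleration G*m/r^2, pointing toward `other`
                a = G * other.mass / (dist * dist)
                ax += a * dx / dist
                ay += a * dy / dist
            b.vx += ax * dt
            b.vy += ay * dt
        for b in bodies:
            b.x += b.vx * dt
            b.y += b.vy * dt

    # e.g. a heavy "sun" and a light "planet" on a roughly circular orbit
    sun = Body(0, 0, 0, 0, 1000.0)
    planet = Body(10, 0, 0, math.sqrt(G * 1000.0 / 10), 1.0)
    for _ in range(1000):
        step([sun, planet], dt=0.01)

Updating all velocities first and only then the positions (semi-implicit Euler) keeps orbits noticeably more stable than naive Euler, which is why even a tiny sketch like this produces believable planetary motion.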
