Thursday, January 2nd

Scaling Pinterest

This is an InfoQ video in which Marty Weiner and Yash Nelapati talk about the decisions they made along their journey, from the beginning up to now. I found it very interesting because they highlighted some concepts that, despite their triviality, have a real and relevant impact.

Apache Mesos

Apache Mesos was mentioned in another InfoQ video I was watching, so let’s take a quick look. According to the site, Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, MPI, Hypertable, Spark, and other applications on a dynamically shared pool of nodes. It is a distributed computing platform, or we could think of it as a sort of distributed OS. It implements a master/slave architecture and has the following components:

  • Master(s): one master is elected among the available masters (through a ZooKeeper cluster). The master doesn’t do much: it mainly manages resources (CPU, memory, …), launches tasks on slaves, and forwards status messages between tasks and frameworks.
  • Slave(s): a slave executes the tasks submitted by frameworks, monitors individual tasks and reports their status to the master, and ensures that tasks don’t exceed resource limits.
  • Framework(s): a framework is, for instance, your application. It receives resource offers from the master and launches tasks.
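
To make the roles above a bit more concrete, here is a toy, in-memory simulation of the resource-offer idea. It is only a sketch and not the Mesos API: `Offer`, `Task`, `ToyFramework` and `ToyMaster` are hypothetical names, there is no networking or ZooKeeper, and the "first task that fits" policy is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """Resources that one slave currently has free."""
    slave_id: str
    cpus: float
    mem_mb: int


@dataclass
class Task:
    name: str
    cpus: float
    mem_mb: int


@dataclass
class ToyFramework:
    """Hypothetical framework: launches every pending task that fits an offer."""
    pending: list = field(default_factory=list)

    def resource_offer(self, offer):
        launched = []
        cpus, mem = offer.cpus, offer.mem_mb
        for task in list(self.pending):  # iterate over a copy while removing
            if task.cpus <= cpus and task.mem_mb <= mem:
                cpus -= task.cpus
                mem -= task.mem_mb
                self.pending.remove(task)
                launched.append(task)
        return launched


class ToyMaster:
    """Hypothetical master: tracks free resources and offers them to a framework."""

    def __init__(self, free):
        self.free = free      # slave_id -> Offer
        self.running = {}     # slave_id -> list of task names

    def offer_to(self, framework):
        for slave_id, offer in self.free.items():
            for task in framework.resource_offer(offer):
                # Account for the resources the accepted task now uses.
                offer.cpus -= task.cpus
                offer.mem_mb -= task.mem_mb
                self.running.setdefault(slave_id, []).append(task.name)
        return self.running


master = ToyMaster({
    "slave-1": Offer("slave-1", 2.0, 2048),
    "slave-2": Offer("slave-2", 4.0, 8192),
})
framework = ToyFramework([
    Task("web", 1.0, 1024),
    Task("batch", 4.0, 4096),
    Task("cache", 1.0, 512),
])
placement = master.offer_to(framework)
print(placement)
```

The point of the sketch is the division of labour: the master only keeps track of what is free and makes offers, while the framework decides which of its tasks to run on the offered resources.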

Examples of frameworks are:

Pyres – a Resque clone

Resque is a great implementation of a job queue by the people at GitHub. Unfortunately 😛 it’s written in Ruby, so someone who works in Python ported the code, creating PyRes. You can put jobs (which can be any kind of class) on a queue and process them while watching the progress via your browser.
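
As a rough illustration of the "jobs are just classes" idea, here is a minimal in-memory sketch. It is not PyRes itself: `ToyResQ` and `WordCount` are hypothetical names, a plain `deque` stands in for the Redis-backed queue, and there is no web monitor. The convention shown (a class that names its `queue` and exposes a static `perform` method) is, as far as I understand it, the one Resque-style libraries follow.

```python
import json
from collections import defaultdict, deque


class ToyResQ:
    """In-memory stand-in for a Resque-style queue (hypothetical, not PyRes)."""

    def __init__(self):
        self.queues = defaultdict(deque)   # queue name -> serialized jobs
        self.registry = {}                 # class name -> job class

    def enqueue(self, job_class, *args):
        # A job is stored as a small JSON payload: class name plus arguments.
        self.registry[job_class.__name__] = job_class
        payload = json.dumps({"class": job_class.__name__, "args": list(args)})
        self.queues[job_class.queue].append(payload)

    def work(self, queue):
        # Drain the queue in FIFO order, calling perform() on each job.
        results = []
        while self.queues[queue]:
            job = json.loads(self.queues[queue].popleft())
            results.append(self.registry[job["class"]].perform(*job["args"]))
        return results


class WordCount:
    """Example job: any class with a queue name and a perform() method."""
    queue = "text"

    @staticmethod
    def perform(sentence):
        return len(sentence.split())


resq = ToyResQ()
resq.enqueue(WordCount, "scaling pinterest with mesos")
resq.enqueue(WordCount, "docker")
results = resq.work("text")
print(results)
```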

Pandas cookbook

Pandas is a Python library for doing data analysis. It is fast and lets you do exploratory work really quickly. This is a cookbook that gives you some concrete examples for getting started with pandas.
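
To give a flavour of that kind of exploratory work, here is a tiny sketch. The data is made up for illustration (it is not taken from the cookbook), and the column names are my own assumptions.

```python
import pandas as pd

# Made-up sample data, only to have something to explore.
df = pd.DataFrame({
    "city": ["Rome", "Rome", "Milan", "Milan", "Turin"],
    "month": ["Jan", "Feb", "Jan", "Feb", "Jan"],
    "rides": [120, 150, 200, 180, 90],
})

# Total rides per city, largest first.
totals = df.groupby("city")["rides"].sum().sort_values(ascending=False)

# City with the highest total.
busiest = totals.idxmax()

# Rows for January only, selected with a boolean mask.
january = df[df["month"] == "Jan"]

print(totals)
print(busiest)
print(january)
```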

Some datasets

Here you can find a list of datasets available for download. I hope they can be useful 😛

Docker.io

Docker is an open source project to pack, ship and run any application as a lightweight container. Some people in my office pointed me to this project and it seems quite interesting. Let’s start trying to better understand what it really is.

This is a short description from the site:

Docker containers are both hardware-agnostic and platform-agnostic. This means that they can run anywhere, from your laptop to the largest EC2 compute instance and everything in between – and they don’t require that you use a particular language, framework or packaging system. That makes them great building blocks for deploying and scaling web apps, databases and backend services without depending on a particular stack or provider.

Typically you can distribute applications and sandbox their execution using virtual machines, for instance VMware, Oracle VirtualBox and Amazon EC2 AMIs. With this solution a developer should be able to package their application and distribute / deploy it with little effort. In practice this does not happen, mainly for these reasons:

  • Size: they may be very large and thus difficult to store and transfer
  • Performance
  • Portability: a VM built for one solution does not play very well with competing solutions
  • HW-centric

By contrast, Docker relies on a different sandboxing method known as containerization. Unlike traditional virtualization, containerization takes place at the kernel level.
Docker builds on top of the kernel’s low-level containerization primitives to offer developers a portable format and runtime environment that solves all four problems.
Docker containers are small (and their transfer can be optimized with layers), they have basically zero memory and CPU overhead, they are completely portable, and they are designed from the ground up to be application-centric. In addition, because Docker operates at the OS level, it can still be run inside a VM!
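
The remark about layers is easier to see with a small sketch. This is a simplified model and not how Docker is implemented: here an image is just an ordered list of byte blobs, the layer IDs are content hashes (an assumption made for the example), and `build_image`, `push` and the `registry` dict are hypothetical names. The idea it illustrates is that a layer the remote side already has does not need to be sent again.

```python
import hashlib


def layer_id(content: bytes) -> str:
    """Identify a layer by a hash of its content (assumption for this toy model)."""
    return hashlib.sha256(content).hexdigest()[:12]


def build_image(layers):
    """Model an image as an ordered list of (layer_id, content) pairs."""
    return [(layer_id(content), content) for content in layers]


def push(image, remote_store):
    """Send only the layers the remote does not have; return the bytes sent."""
    sent = 0
    for lid, content in image:
        if lid not in remote_store:
            remote_store[lid] = content
            sent += len(content)
    return sent


# Two versions of an application that share the same base and runtime layers.
base = b"base OS files" * 1000
runtime = b"python runtime" * 500
app_v1 = b"print('v1')"
app_v2 = b"print('v2!')"

registry = {}  # stands in for a remote image store
first = push(build_image([base, runtime, app_v1]), registry)
second = push(build_image([base, runtime, app_v2]), registry)

print(first, second)
```

The first push transfers everything, while the second transfers only the small application layer that changed, because the base and runtime layers are already in the store.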

JavaScript Patterns Collection

A JavaScript pattern and antipattern collection that covers function patterns, jQuery patterns, jQuery plugin patterns, design patterns, general patterns, literals and constructor patterns, object creation patterns, code reuse patterns, DOM and browser patterns.

