Finding a Majority Element in Python 3

Doing more data structure work; asked to find if an array of numbers has a majority element – that is, an element that appears more than 50% of the time in the array.

My assignment is to write a greedy algorithm to do this…but wouldn't it be easier to just use the collections module?

from collections import Counter

def majority_element_finder(numbers):
    return Counter(numbers).most_common()[0][1] > (len(numbers) // 2)

The most_common([n]) returns a list of tuples of the elements in numbers along with the number of times each appears. This is listed in decreasing order. We want to take the first element's ([0]) count ('[1]'), and see if it's greater than half the length of the list.

Thoughts after my first codjo

As part of RC Start, the Recurse Center's mentoring program for new(-ish) programmers, I've started taking part in their "codjo" pair programming challenges. The way it works is that you are given a partner and a puzzle to work on.

A few thoughts about it:

Technology gets in the way

Biggest hurdle in getting it done? Figuring out our technology stack for screen sharing. ScreenHero wouldn't let my partner register, even though she had a valid invite. Then it took a while to get Google Hangouts audio working. We spent way too much time getting technology to work…but that seems to be quite a bit of doing development, too.

Remote pair programming forces you to think clearly

We used a driver/navigator system. My partner navigated (said what to do) and I drove (typed the code).

It was important to choose good variable names and write code that we both could understand. Talking things out and getting and giving immediate feedback on how code read felt transformative, in a way. In the past I'd been fearful of going over my code with people (I'd had some bad experiences), but getting to do it in a collaborative, non-judgmental setting was fantastic.

It's great to have all your tools together

My dev environment - Atom, terminal-plus, Neon Color Scheme

While working on the puzzle I was using Atom along with the terminal-plus plugin. This meant my partner and I could run tests and try out code snippets in IPython without having to leave the environment. Super useful. Of course, it also could be super embarrassing, like when I realized my machine's name was out there for the world to see. (It's from The Onion)

Group code reviews are scary but fun

After we finished and submitted our solution there was a group code review. Three groups shared and explained their code. Seeing how other people approached the problem was very helpful – something I love about coding is that you're always able to learn from the code you read. Having the authors explain their thinking and then getting a review of the code from an experienced developer made it even easier to learn.

In summation

Do pair programming. It's worth it.

Rounding decimals in Python (or: Why doesn't .5 round to 1?)

Doing my morning Hacker Rank puzzle this morning, I discovered Python 3 doesn't round numbers ending in .5 the way I'd expect:

>>> 12
>>> 14
>>> 12
>>> 14

As always, Stack Overflow had the answer: Python rounds .5 down sometimes because of Banker's Rounding, also known by the much more informative name "Round Half To Even". Python will round .5 numbers to the nearest even whole.

In the problem I was solving (giving a rounded total cost of a meal), this didn't work, so I had to use decimal.Decimal's quantize method to round up:

from decimal import Decimal

mc = float(input())  # these unpythonic variable names came from HackerRank
tp = int(input())
tax = int(input())

def totalCost(mc, tp, tax):
    total = Decimal(mc + (tp / 100 * float(mc)) + ((tax / 100 * float(mc)))).quantize(Decimal('1'))
    print("The total meal cost is {} dollars.".format(total))

totalCost(mc, tp, tax)

I'm over 80 characters in that line, so there's the potential for refactoring.

The difference between list.sort and sorted(list) in Python 3

The other week for my algorithms class I needed to find the minimum dot product of a list. My first try at a solution, which threw an error, was this:

def min_dot_product(a, b, n):
    a = sorted(a)
    b = sorted(b, reverse=True)
    res = 0
    for i in range(n):
        res += a[i] * b[i]
    return res

Python has two different ways of sorting a list: list.sort([reverse=False]) and sorted(list[, reverse=False]). Why a method and a function to do the same thing? Let's look:

li = [5, 1, 4, 3, 2]
>>> [1, 2, 3, 4, 5]
>>> [5, 4, 3, 2, 1]

list.sort() sorts the list and returns None.

On the other hand, sorted(list) sorts and returns the list

li = [5, 1, 4, 3, 2]
sorted(li, reverse=True)
>>> [5, 4, 3, 2, 1]

I've contributed the above to SO's Python documentation on lists and list methods.

Fun with Python 3's print function

Fun with the print function

Most Pythonistas learn early on that the print function can evaluate statements:

>>> False
>>> True

However, I didn't know that print can return different strings based upon whether or not the following statement is true or false. The format is:

print(['output for False', 'output for True']['statement_to_evaluate'])

For example:

print(['no', 'yes'][2 > 3])
>>> no

print(['Go back to math class', 'You know math!'][7 + 5 == 12])
>>> You know math!

Greedy Algorithms: Making Change

In this week's installment of my Coursera class on Algorithms, we've been introduced to "greedy" algorithms. One of the homework requires us to solve a tried and tested greedy problem: determining the least number of coins needed to change money.

My first answer to this question wasn't actually a greedy algorithm…but it was only one line:

def get_change(num):
    return (num % 5 + ((num - (num % 5)) % 10 // 5) + ((num - (num % 10)) // 10))

Which adds up the constituent ones, fives, and tens of the number. Here's the code on PythonTutor. While this code passed all the tests, it is neither readable nor greedy. So, back to the drawing board.

Next, I wrote a function that took two variables: the amount (num) and number of coins (coins):

def get_change(m, coins):

    while m:
        if m - 10 >= 0:
            m -= 10
            coins += 1
            return get_change(m, coins)
        if m - 5 >= 0:
            m -= 5
            coins += 1
            return get_change(m, coins)
        if m - 1 >= 0:
            m -= 1
            coins += 1
            return get_change(m, coins)

    return coins

Ho there, recursion! The program keeps calling itself, stating the amount of money (m) remaining and the number of coins each time. This also wound up being about twice as fast as my first version of the code: in the online grader, the first version of the code took .04 seconds, while the second iteration took just .02 seconds. Pretty good! Here's the code on PythonTutor; it takes 26 steps to find how many coins are needed to change six cents.

That's fine, but I wanted to see if I could do it without that second variable. And whaddayaknow, I could.

def get_change(m):

    coins = 0

    while m:
        while m - 10 >= 0:
            m -= 10
            coins += 1
        while m - 5 >= 0:
            m -= 5
            coins += 1
        while m - 1 >= 0:
            m -= 1
            coins += 1

    return coins

That just seems much cleaner and more readable, doesn't it? Looking at it on PythonTutor, it only takes 17 steps to solve get_change(6) – much more efficient than my second attempt! In the online grader this third iteration ran slightly more efficiently, using 4 MB less memory than the second iteration taking the same amount of time.

What I Would Do Next

Next steps for the above? If I were to come back to this assignment, I'd work to make the code reusable. Right now it is built to work with pennies, nickels, and dimes. It would be an idea to allow the user to enter an amount and a list of coin values that could be used to determine how many coins are needed to make change.

Chained geography lookups in Django with django-smart-selects and django-cities-light

Over the past months I've been working in my spare time on a Django based website, allTEFL, a sort of Glassdoor site for TEFL teachers. This was the project that first made me want to learn to program, and it is hugely exciting to see it come to life.

This is the first in a series of posts in which I'll describe how I designed and organized the site, then talk about some of the fun coding challenges I came across as I worked on it.

First we'll walk through some simple user stories, then talk about how use case scenarios affected how I created my models.

User Stories

At first, I figured that there would basically be three things that would happen:

  1. A Teacher creates an account so she can review a school (school_review) or a recruiter (recruiter_review).
  2. A Teacher adds a school_review to the database. This is publicly viewable as an anonymous post on the School's page.
  3. Specific data from the school_review (school name, dates of employment) is visible in the Teacher's resume_view.

Putting the Parts Together

Creating a Teacher model was easy; use Django's auth.User model. The School model has proven to be much more interesting.

The School

A Teacher has a one-to-many relationship with Schools that she's taught at and reviewed.

First, a complication: there is no complete list of all the English schools or TEFL recruiters in China. Thus, (2) above has a problem: the Teacher's School may not exist in the database. So let's revise my user stories:

  1. A Teacher adds a school_review for a School in the database.
  2. If the School is not in the database already, the Teacher adds it before reviewing it.

Cool. But as it turns out, doing this is a bit harder than it seemed at first glance.

Letting a user add Schools to the database is problematic because we don't want data to be fragmented. (One teacher might enter Beijing High School, but another teacher from the same school might look for BHS and then create a new instance of the School model.) So it's important to make finding your school's name as streamlined and precise as possible, reducing peoples' temptation to not look and just add a new school.

We could just add schools by name, but this is problematic because Schools may well share names. For example, there are a number of chain schools, similar to test prep behemoths Kaplan or Princeton Review in the US. We want to make sure Wall Street English in Wudaokou, Beijing is distinguishable from WSE in Zhongguancun, Beijing. Also, users will doubtless want to look up schools by location So, just having a list of schools organized by name is not an appropriate way to organize the db.

Chained Selects in Django

So, how about a chained lookup in our form. A chained lookup means finding all Y that belong to X. This will allow us to select schools (and enter them into the database) based on location. The user can select a Province, which brings up a list of all its City. After selecting the City, you can either (a) select your School or (b) add_school to the database if necessary.

Country -> Province List -> City List -> Schools
China -> Jilin -> Changchun -> Sino-American Denver Foreign Language School

But to do this, I need a list of all the provinces (and their constituent cities) in a given country. That sounds like quite a bit of work to do by hand…so surely someone's automated it, right?

There are two Django packages that can be combined to (a) populate tables with country, province/state, and city data, and (b) make it easy to build geographic chained selections: django-cities-light and django-smart-selects. Below I will show you how to use the two together to create your own forms in Django that automatically whittle down information from the database to

For my use case it's been necessary to make custom models so I can have views for city, region, and city. Because of this, I need to use the abstract models included in cities_light.

from cities_light.abstract_models import AbstractCountry, AbstractRegion, AbstractCity

These are our necessary imports. First, we are using the generic models from django-cities-light: Country, Region, and City.

IMPORTANT There is a known bug in django-cities-light's abstract models that requires you to declare your models in the same order that you imported the Abstract models.

from cities_light.abstract_models import AbstractCountry, AbstractRegion, AbstractCity
from cities_light.receivers import connect_default_signals

class Country(AbstractCountry):


class Region(AbstractRegion):


class City(AbstractCity):


We need to register our custom models in the site's


CITIES_LIGHT_TRANSLATION_LANGUAGES = ['zh', 'en'] # I want to include both Chinese & English names
CITIES_LIGHT_INCLUDE_COUNTRIES = ['CN'] # Download geographic information for China.
CITIES_LIGHT_APP_NAME = 'app_name' # Let the app know that I have custom models, defined in this app.

From the terminal we now need to run ./ cities_light to populate the tables. Three tables will be created: country, region, and city. The tables have a bunch of columns in them. The most important are:


id name_ascii slug
1 China china


Header One Header Two
Item One Item Two
id name_ascii slug display_name country_id
2 Qinghai qinghai Qinghai, China Foreign Key

Worth noting that this model has a foreign key, country_id, that associates with a country.


id name_ascii slug display_name country_id region_id
1 Rikaze rikaze Rikaze, Tibet Autonomous Region, China ForeignKey(country_id) ForeignKey(region_id)

Two foreign keys here - country_id and region_id.

The tables include multiple versions of the location's name. I've found name_ascii and display_name to be the two most useful. THe latter includes province and country.

Linking to City/Province/Country views

Our existing models are nice, but I want to be able to link to views for each of these models. This means writing a get_absolute_url function for them in my models, adding URL patterns to, and adding views for each.

    class Country(AbstractCountry):

        def get_absolute_url(self):
            return reverse('country_school_list',
                           kwargs={'country_slug': self.slug})


    class Region(AbstractRegion):

        def get_absolute_url(self):
            return reverse('province_school_list',
                                   'province_slug': self.slug})


    class City(AbstractCity):

        def get_absolute_url(self):
            return reverse('city_school_list',
                                   'province_slug': self.region.slug,
                                   'city_slug': self.slug})





    from django.views.generic import ListView, DetailView, CreateView
    from .models import School, Country, Region, City

    class CountrySchoolList(ListView):

        model = School
        template_name = "school_reviews/school_list.html"
        ordering = ('school_province', 'school_city', 'school_name')

        def get_queryset(self, **kwargs):
   = get_object_or_404(Country, slug=self.kwargs['country_slug'])
            return School.objects.filter(

        def get_context_data(self, **kwargs):
            context = super(CountrySchoolList, self).get_context_data(**kwargs)
            context['section'] = 'country'
            return context

    class ProvinceSchoolList(ListView):

        model = School
        template_name = "school_reviews/school_list.html"
        ordering = ('school_city', 'school_name')

        def get_queryset(self, **kwargs):
   = get_object_or_404(Country, slug=self.kwargs['country_slug'])
            self.province = get_object_or_404(Region, slug=self.kwargs['province_slug'])
            return School.objects.filter(,

        def get_context_data(self, **kwargs):
            context = super(ProvinceSchoolList, self).get_context_data(**kwargs)
            context['section'] = 'province'
            return context

    class CitySchoolList(ListView):

        model = School
        template_name = "school_reviews/school_list.html"
        ordering = ('school_name')

        def places(self, **kwargs):
            country = get_object_or_404(Country, slug=self.kwargs['country_slug'])
            province = get_object_or_404(Region, slug=self.kwargs['province_slug'])
            city = get_object_or_404(City, slug=self.kwargs['city_slug'])
            return country, province, city

        def get_queryset(self, **kwargs):
   = get_object_or_404(Country, slug=self.kwargs['country_slug'])
            self.province = get_object_or_404(Region, slug=self.kwargs['province_slug'])
   = get_object_or_404(City, slug=self.kwargs['city_slug'])
            return School.objects.filter(,

        def get_context_data(self, **kwargs):
            context = super(CitySchoolList, self).get_context_data(**kwargs)
            context['section'] = 'city'
            context['country'] = get_object_or_404(Country, slug=self.kwargs['country_slug'])
            return context

Smart Selects

The last thing we need to do is write our models incorporating the ChainedForeignKey that django-smart-selects provides. Keep in mind that the chained_field is the name in your model, while the chained_model_field is the name in your database. My names aren't the same since I used django-cities-light to populate the db.

class School(models.Model):

    school_name = models.CharField(blank=False, max_length=200, unique=True)
    slug = models.SlugField(unique=True)
    school_country = models.ForeignKey(Country)
    school_province = ChainedForeignKey(
        help_text="The school's province."
    school_city = ChainedForeignKey(
        help_text="The school's city."

    def save(self, *args, **kwargs):
        self.slug = slugify(self.school_name)
        super(School, self).save(*args, **kwargs)

    def __str__(self):
        return self.school_name

    class Meta:
        ordering = ('school_country', 'school_province', 'school_city', 'school_name')

And there you have it. A Django model that relies upon chained selections of geographic data that have been auto-populated to your database.