Dictionaries
a_dict = {'color': 'blue', 'fruit': 'apple', 'pet': 'dog'}
for key in a_dict:
    print(key)
## color
## fruit
## pet
for key in a_dict:
    print(key, '->', a_dict[key])
## color -> blue
## fruit -> apple
## pet -> dog
The preceding code allowed you to get access to the keys (key) and the values (a_dict[key]) of a_dict at the same time. This way, you can do any operation with both the keys and the values.
Iterating Through .items()
When you’re working with dictionaries, it’s likely that you’ll want to work with both the keys and the values. One of the most useful ways to iterate through a dictionary in Python is by using .items(), which is a method that returns a new view of the dictionary’s items:
a_dict = {'color': 'blue', 'fruit': 'apple', 'pet': 'dog'}
d_items = a_dict.items()
d_items # Here d_items is a view of items
## dict_items([('color', 'blue'), ('fruit', 'apple'), ('pet', 'dog')])
Dictionary views like d_items provide a dynamic view on the dictionary’s entries, which means that when the dictionary changes, the views reflect these changes.
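To make the dynamic behavior concrete, here is a minimal sketch. The dictionary name and its contents (colors, 'sky', 'grass', 'sun') are made up for illustration and are not part of the examples in this article:

```python
# Hypothetical data, used only to illustrate how views track changes.
colors = {'sky': 'blue', 'grass': 'green'}
color_items = colors.items()      # create the view once

colors['sun'] = 'yellow'          # add an entry after the view exists
del colors['grass']               # remove an entry as well

# The view was never re-created, yet it reflects both changes.
print(list(color_items))
# [('sky', 'blue'), ('sun', 'yellow')]
```

Because the view holds no copy of the data, it’s cheap to create, but you should take a snapshot with list() if you need the entries as they were at a given moment.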
Views can be iterated over to yield their respective data, so you can iterate through a dictionary in Python by using the view object returned by .items():
for item in a_dict.items():
    print(item)
## ('color', 'blue')
## ('fruit', 'apple')
## ('pet', 'dog')
The view object returned by .items() yields the key-value pairs one at a time and allows you to iterate through a dictionary in Python, but in such a way that you get access to the keys and values at the same time.
If you take a closer look at the individual items yielded by .items(), you’ll notice that they’re really tuple objects. Let’s take a look:
for item in a_dict.items():
    print(item)
    print(type(item))
## ('color', 'blue')
## <class 'tuple'>
## ('fruit', 'apple')
## <class 'tuple'>
## ('pet', 'dog')
## <class 'tuple'>
Once you know this, you can use tuple unpacking to iterate through the keys and values of the dictionary you are working with. To achieve this, you just need to unpack the elements of every item into two different variables representing the key and the value:
for key, value in a_dict.items():
    print(key, '->', value)
## color -> blue
## fruit -> apple
## pet -> dog
Here, the variables key and value in the header of your for loop do the unpacking. Every time the loop runs, key will store the key, and value will store the value of the item that is being processed. This way, you’ll have more control over the items of the dictionary, and you’ll be able to process the keys and values separately in a way that is more readable and Pythonic.
Note: .keys() and .values() return view objects just like .items(), so you can loop over them when you need only the keys or only the values.
These views also support membership tests (in), which is useful when you need to check whether a specific key or value is in a dictionary:
a_dict = {'color': 'blue', 'fruit': 'apple', 'pet': 'dog'}
'pet' in a_dict.keys()
## True
'apple' in a_dict.values()
## True
'onion' in a_dict.values()
## False
'color' in a_dict.values()
## False
Modifying values and keys
You can modify the values of a dictionary while you iterate through it, but you need to assign through the original dictionary and the key that maps to the value you want to modify, because reassigning the loop variable v would not change the dictionary:
prices = {'apple': 0.40, 'orange': 0.35, 'banana': 0.25}
for k, v in prices.items():
    prices[k] = round(v * 0.9, 2) # Apply a 10% discount
prices
## {'apple': 0.36, 'orange': 0.32, 'banana': 0.23}
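The example above changes only values. Changing keys is a different matter, because it means removing one entry and adding another, and doing that while looping over the dictionary itself is unsafe and typically raises a RuntimeError. The following is one possible sketch, assuming you want to rename every key; the dictionary name menu is hypothetical. It loops over a snapshot of the keys instead of the live dictionary:

```python
# Hypothetical dictionary used only for this sketch.
menu = {'apple': 0.40, 'orange': 0.35, 'banana': 0.25}

# list(menu) takes a snapshot of the keys, so the loop is not affected
# by the deletions and insertions performed in its body.
for key in list(menu):
    menu[key.upper()] = menu.pop(key)   # re-insert the value under the new key

print(menu)
# {'APPLE': 0.4, 'ORANGE': 0.35, 'BANANA': 0.25}
```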
Dictionary comprehension
Suppose, for example, that you have two lists of data, and you need to create a new dictionary from them. In this case, you can use Python’s zip(*iterables) to loop over the elements of both lists in pairs:
objects = ['blue', 'apple', 'dog']
categories = ['color', 'fruit', 'pet']
a_dict = {key: value for key, value in zip(categories, objects)}
a_dict
## {'color': 'blue', 'fruit': 'apple', 'pet': 'dog'}
Here, zip() receives two iterables (categories and objects) as arguments and makes an iterator that aggregates elements from each iterable. The tuple objects generated by zip() are then unpacked into key and value, which are finally used to create the new dictionary.
Turning keys into values and vice versa
a_dict = {'one': 1, 'two': 2, 'three': 3, 'four': 4}
new_dict = {value: key for key, value in a_dict.items()}
new_dict
## {1: 'one', 2: 'two', 3: 'three', 4: 'four'}
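Swapping keys and values this way works cleanly only when the values are hashable and unique. The sketch below uses made-up data (grades and its contents are hypothetical) to show what happens when a value repeats: the later key silently replaces the earlier one.

```python
# Hypothetical data with a repeated value ('A').
grades = {'ann': 'A', 'bob': 'B', 'cid': 'A'}

inverted = {value: key for key, value in grades.items()}

# 'A' appears twice, so the entry for 'ann' is overwritten by 'cid'.
print(inverted)
# {'A': 'cid', 'B': 'bob'}
```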
Filtering items
To filter the items in a dictionary with a comprehension, you just need to add an if clause that defines the condition you want to meet.
a_dict = {'one': 1, 'two': 2, 'three': 3, 'four': 4}
new_dict = {k: v for k, v in a_dict.items() if v <= 2}
new_dict
## {'one': 1, 'two': 2}
Doing some calculations
You can also use comprehensions to run calculations over a dictionary’s data. For example, a list comprehension over the dictionary’s values gives you compact, Pythonic code for totaling them:
incomes = {'apple': 5600.00, 'orange': 3500.00, 'banana': 5000.00}
total_income = sum([value for value in incomes.values()])
total_income
## 14100.0
The list comprehension created a list object containing the values of incomes, and then you summed up all of them by using sum() and stored the result in total_income.
If you’re working with a really large dictionary, and memory usage is a problem for you, then you can use a generator expression instead of a list comprehension. A generator expression is an expression that returns an iterator. It looks like a list comprehension, but instead of brackets you need to use parentheses to define it:
total_income = sum(value for value in incomes.values())
total_income
## 14100.0
Replacing the square brackets with parentheses (here, the parentheses of the sum() call itself are enough) turns the list comprehension into a generator expression. The code becomes more memory efficient because generator expressions yield elements on demand: instead of creating and storing the whole list in memory, you only hold one element at a time.
Finally, there is a simpler way to solve this problem by just using incomes.values() directly as an argument to sum():
total_income = sum(incomes.values())
total_income
## 14100.0
Removing specific items
Now, suppose you have a dictionary and need to create a new one with selected keys removed. Remember how key-view objects are like sets? Well, these similarities go beyond just being collections of hashable and unique objects. Key-view objects also support common set operations. Let’s see how you can take advantage of this to remove specific items in a dictionary:
incomes = {'apple': 5600.00, 'orange': 3500.00, 'banana': 5000.00}
non_citric = {k: incomes[k] for k in incomes.keys() - {'orange'}}
non_citric
## {'apple': 5600.0, 'banana': 5000.0}
This code works because key-view objects support set operations like unions, intersections, and differences. When you wrote incomes.keys() - {'orange'} inside the dictionary comprehension, you were really doing a set difference operation. The result of that operation is a set, so the order of the keys in the new dictionary is not guaranteed. If you need to perform any set operations with the keys of a dictionary, then you can just use the key-view object directly without first converting it into a set. This is a little-known feature of key-view objects that can be useful in some situations.
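The other set operations work the same way. As a small sketch, assume a second dictionary called expenses (hypothetical, not part of the original example) that shares some keys with incomes:

```python
incomes = {'apple': 5600.00, 'orange': 3500.00, 'banana': 5000.00}
expenses = {'orange': 1200.00, 'banana': 900.00, 'kiwi': 300.00}  # hypothetical

common = incomes.keys() & expenses.keys()        # intersection
either = incomes.keys() | expenses.keys()        # union
only_incomes = incomes.keys() - expenses.keys()  # difference

# Each result is a plain set, so sort it when you need a stable order.
print(sorted(common))        # ['banana', 'orange']
print(sorted(either))        # ['apple', 'banana', 'kiwi', 'orange']
print(sorted(only_incomes))  # ['apple']
```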
Sorting a Dictionary
It’s often necessary to sort the elements of a collection. Since Python 3.7, dictionaries are guaranteed to preserve insertion order (in CPython 3.6 this was an implementation detail), so you can build a dictionary whose items are in sorted order by combining sorted() with a dictionary comprehension:
incomes = {'apple': 5600.00, 'orange': 3500.00, 'banana': 5000.00}
sorted_income = {k: incomes[k] for k in sorted(incomes)}
sorted_income
## {'apple': 5600.0, 'banana': 5000.0, 'orange': 3500.0}
This code allows you to create a new dictionary with its keys in sorted order. This is possible because sorted(incomes) returns a list of sorted keys that you can use to generate the new dictionary sorted_income.
Sorted by values
for value in sorted(incomes.values()):
    print(value)
## 3500.0
## 5000.0
## 5600.0
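The loop above yields the sorted values but loses the keys they belong to. If you need both, one option (a sketch, not the only approach) is to sort the items with a key function that picks the value out of each (key, value) tuple. The name by_value is hypothetical:

```python
incomes = {'apple': 5600.00, 'orange': 3500.00, 'banana': 5000.00}

# item is a (key, value) tuple, so item[1] is the value to sort by.
by_value = dict(sorted(incomes.items(), key=lambda item: item[1]))

print(by_value)
# {'orange': 3500.0, 'banana': 5000.0, 'apple': 5600.0}
```

Passing reverse=True to sorted() would give the same dictionary in descending order of value.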
Using Some of Python’s Built-In Functions
map()
Python’s map() is defined as map(function, iterable, …) and returns an iterator that applies function to every item of iterable, yielding the results on demand. So, map() could be viewed as an iteration tool that you can use to iterate through a dictionary in Python.
Suppose you have a dictionary containing the prices of a bunch of products, and you need to apply a discount to them. In this case, you can define a function that manages the discount and then use it as the first argument to map(). The second argument can be prices.items():
prices = {'apple': 0.40, 'orange': 0.35, 'banana': 0.25}
def discount(current_price):
    return (current_price[0], round(current_price[1] * 0.95, 2))
new_prices = dict(map(discount, prices.items()))
new_prices
## {'apple': 0.38, 'orange': 0.33, 'banana': 0.24}
filter()
filter() is another built-in function that you can use to iterate through a dictionary in Python and filter out some of its items. This function is defined as filter(function, iterable) and returns an iterator over those elements of iterable for which function returns a true value.
Suppose you want to know the products with a price lower than 0.40. You need to define a function that determines whether the price satisfies that condition and pass it as the first argument to filter(). The second argument can be prices.keys():
prices = {'apple': 0.40, 'orange': 0.35, 'banana': 0.25}
def has_low_price(product):
    return prices[product] < 0.4
low_price = list(filter(has_low_price, prices.keys()))
low_price
## ['orange', 'banana']
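For comparison, the same two results can be produced with comprehensions, which avoids defining helper functions. This is only an alternative sketch; the names discounted and low_price_products are hypothetical:

```python
prices = {'apple': 0.40, 'orange': 0.35, 'banana': 0.25}

# Equivalent of the map() example: apply a 5% discount to every price.
discounted = {name: round(price * 0.95, 2) for name, price in prices.items()}

# Equivalent of the filter() example: keep products cheaper than 0.40.
low_price_products = [name for name, price in prices.items() if price < 0.4]

print(discounted)          # {'apple': 0.38, 'orange': 0.33, 'banana': 0.24}
print(low_price_products)  # ['orange', 'banana']
```

Which form reads better is largely a matter of taste; map() and filter() tend to pay off when the function already exists and can be reused.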
Using collections.ChainMap
collections is a useful module from the Python Standard Library that provides specialized container data types. One of these data types is ChainMap, which is a dictionary-like class for creating a single view of multiple mappings (like dictionaries). With ChainMap, you can group multiple dictionaries together to create a single, updateable view.
Now, suppose you have two (or more) dictionaries, and you need to iterate through them together as one. To achieve this, you can create a ChainMap object and initialize it with your dictionaries:
from collections import ChainMap
fruit_prices = {'apple': 0.40, 'orange': 0.35}
vegetable_prices = {'pepper': 0.20, 'onion': 0.55}
chained_dict = ChainMap(fruit_prices, vegetable_prices)
chained_dict # A ChainMap object
## ChainMap({'apple': 0.4, 'orange': 0.35}, {'pepper': 0.2, 'onion': 0.55})
for key in chained_dict:
    print(key, '->', chained_dict[key])
## pepper -> 0.2
## onion -> 0.55
## apple -> 0.4
## orange -> 0.35
After importing ChainMap from collections, you need to create a ChainMap object with the dictionaries you want to chain, and then you can freely iterate through the resulting object as you would do with a regular dictionary.
ChainMap objects also implement .keys(), .values(), and .items() as a standard dictionary does, so you can use these methods to iterate through the dictionary-like object generated by ChainMap, just like you would do with a regular dictionary:
for key, value in chained_dict.items():
    print(key, '->', value)
## pepper -> 0.2
## onion -> 0.55
## apple -> 0.4
## orange -> 0.35
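Two details of ChainMap are easy to miss: when the same key appears in several mappings, lookups return the value from the first mapping that contains it, and writes only affect the first mapping. The sketch below illustrates both with made-up dictionaries (overrides, defaults, and chained are hypothetical names):

```python
from collections import ChainMap

overrides = {'apple': 0.45}                   # searched first
defaults = {'apple': 0.40, 'pepper': 0.20}    # searched second
chained = ChainMap(overrides, defaults)

# Lookups stop at the first mapping that has the key.
print(chained['apple'])   # 0.45
print(chained['pepper'])  # 0.2

# Writes go to the first mapping only; defaults is left untouched.
chained['onion'] = 0.55
print(overrides)  # {'apple': 0.45, 'onion': 0.55}
print(defaults)   # {'apple': 0.4, 'pepper': 0.2}
```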
Handling Missing Keys in Dictionaries
a_dict = {}
a_dict.setdefault('missing_key', 'default value')
## 'default value'
a_dict['missing_key']
## 'default value'
a_dict.setdefault('missing_key', 'another default value')
## 'default value'
a_dict
## {'missing_key': 'default value'}
In the above code, you use .setdefault() to generate a default value for missing_key. Notice that your dictionary, a_dict, now has a new key called missing_key whose value is ‘default value’. This key didn’t exist before you called .setdefault(). Finally, if you call .setdefault() on an existing key, then the call won’t have any effect on the dictionary. Your key will hold the original value instead of the new default value.
defaultdict Type for Handling Missing Keys
The Python standard library provides collections, which is a module that implements specialized container types. One of those is the Python defaultdict type, which is an alternative to dict that’s specifically designed to help you out with missing keys. defaultdict is a Python type that inherits from dict.
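You can check the inheritance relationship yourself. This short sketch uses a throwaway instance named counts, which is hypothetical:

```python
from collections import defaultdict

# defaultdict is a subclass of dict ...
print(issubclass(defaultdict, dict))  # True

# ... so any instance can be used wherever a dict is expected.
counts = defaultdict(int)
print(isinstance(counts, dict))       # True
```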
The Python defaultdict type behaves almost exactly like a regular Python dictionary, but if you try to access or modify a missing key, then defaultdict will automatically create the key and generate a default value for it. This makes defaultdict a valuable option for handling missing keys in dictionaries.
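By contrast, a regular dictionary raises a KeyError when you access a missing key:
a_dict = {}
a_dict['missing_key']
## KeyError: 'missing_key'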
Sometimes, you’ll use a mutable built-in collection (a list, dict, or set) as values in your Python dictionaries. In these cases, you’ll need to initialize the keys before first use, or you’ll get a KeyError. You can either do this process manually or automate it using a Python defaultdict. In this section, you’ll learn how to use the Python defaultdict type for solving some common programming problems:
- Grouping the items in a collection
- Counting the items in a collection
- Accumulating the values in a collection
You’ll be covering some examples that use list, set, int, and float to perform grouping, counting, and accumulating operations in a user-friendly and efficient way.
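Grouping with list is covered in detail below. As a rough sketch of the other factories mentioned above, the following uses int for counting, float for accumulating, and set for grouping without duplicates. All data and variable names here are made up for illustration:

```python
from collections import defaultdict

# Counting: int() returns 0, so every new key starts at zero.
letter_counts = defaultdict(int)
for letter in 'mississippi':
    letter_counts[letter] += 1
print(dict(letter_counts))
# {'m': 1, 'i': 4, 's': 4, 'p': 2}

# Accumulating: float() returns 0.0, so totals start at zero.
sales = [('apple', 1.5), ('orange', 2.0), ('apple', 2.5)]  # hypothetical data
totals = defaultdict(float)
for product, amount in sales:
    totals[product] += amount
print(dict(totals))
# {'apple': 4.0, 'orange': 2.0}

# Grouping unique items: set() returns an empty set, which drops duplicates.
visits = [('Sales', 'John Doe'), ('Sales', 'John Doe'), ('Accounting', 'Jane Doe')]
unique_visitors = defaultdict(set)
for department, name in visits:
    unique_visitors[department].add(name)
print(dict(unique_visitors))
# {'Sales': {'John Doe'}, 'Accounting': {'Jane Doe'}}
```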
Grouping items
A typical use of the Python defaultdict type is to set .default_factory to list and then build a dictionary that maps keys to lists of values. With this defaultdict, if you try to get access to any missing key, then the dictionary runs the following steps:
1. Call list() to create a new empty list
2. Insert the empty list into the dictionary using the missing key as key
3. Return a reference to that list
This allows you to write code like this:
from collections import defaultdict
dd = defaultdict(list)
dd['key'].append(1)
dd
## defaultdict(<class 'list'>, {'key': [1]})
dd['key'].append(2)
dd
## defaultdict(<class 'list'>, {'key': [1, 2]})
dd['key'].append(3)
dd
## defaultdict(<class 'list'>, {'key': [1, 2, 3]})
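The steps that run on a missing key can be imitated with a small dict subclass that defines __missing__(), the hook Python calls when a subscript lookup fails. The class below is a hypothetical stand-in written for illustration; it is not how defaultdict is actually implemented:

```python
class ListDefaultDict(dict):
    """Hypothetical stand-in that imitates defaultdict(list)."""

    def __missing__(self, key):
        new_list = list()       # create a new empty list
        self[key] = new_list    # insert it under the missing key
        return new_list         # return a reference to that list


ldd = ListDefaultDict()
ldd['key'].append(1)   # 'key' is missing, so __missing__() supplies the list
ldd['key'].append(2)   # 'key' now exists, so the same list is reused
print(ldd)
# {'key': [1, 2]}
```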
You can use defaultdict along with list to group the items in a sequence or a collection. Suppose that you’ve retrieved the following data from your company’s database:
import pandas as pd
d = {'Department': ['Sales', 'Sales', 'Accounting', 'Marketing', 'Marketing'], 'Employee Name': ['John Doe', 'Martin Smith', 'Jane Doe', 'Elizabeth Smith', 'Adam Doe']}
df = pd.DataFrame(data=d)
df
##    Department    Employee Name
## 0       Sales         John Doe
## 1       Sales     Martin Smith
## 2  Accounting         Jane Doe
## 3   Marketing  Elizabeth Smith
## 4   Marketing         Adam Doe
With this data, you create an initial list of tuple objects like the following:
dep = [('Sales', 'John Doe'),
('Sales', 'Martin Smith'),
('Accounting', 'Jane Doe'),
('Marketing', 'Elizabeth Smith'),
('Marketing', 'Adam Doe')]
Now, you need to create a dictionary that groups the employees by department. To do this, you can use a defaultdict as follows:
from collections import defaultdict
dep_dd = defaultdict(list)
for department, employee in dep:
    dep_dd[department].append(employee)
dep_dd
## defaultdict(<class 'list'>, {'Sales': ['John Doe', 'Martin Smith'], 'Accounting': ['Jane Doe'], 'Marketing': ['Elizabeth Smith', 'Adam Doe']})
Here, you create a defaultdict called dep_dd and use a for loop to iterate through your dep list. The statement dep_dd[department].append(employee) creates the keys for the departments, initializes them to an empty list, and then appends the employees to each department, which produces the output shown above.
In this example, you group the employees by their department using a defaultdict with .default_factory set to list. To do this with a regular dictionary, you can use dict.setdefault() as follows:
dep_d = dict()
for department, employee in dep:
    dep_d.setdefault(department, []).append(employee)
This code is straightforward, and you’ll find similar code quite often in your work as a Python coder. However, the defaultdict version is arguably more readable, and it can also be somewhat faster on large datasets because it avoids building a new empty list on every call to .setdefault(). So, if speed is a concern for you, then consider using a defaultdict instead of a standard dict.
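If you want to check the speed difference on your own machine, a rough timing sketch like the following can help. The synthetic data, the function names, and the repetition count are all assumptions made for illustration, and the numbers you get will vary with your Python version and hardware:

```python
import timeit
from collections import defaultdict

# Hypothetical synthetic data: 100,000 pairs spread over 100 departments.
pairs = [(f'dept_{i % 100}', f'employee_{i}') for i in range(100_000)]


def group_with_defaultdict():
    groups = defaultdict(list)
    for department, employee in pairs:
        groups[department].append(employee)
    return groups


def group_with_setdefault():
    groups = {}
    for department, employee in pairs:
        # A new empty list is built on every call, even when it is not needed.
        groups.setdefault(department, []).append(employee)
    return groups


# Time 10 runs of each version; lower is faster.
print('defaultdict:', timeit.timeit(group_with_defaultdict, number=10))
print('setdefault: ', timeit.timeit(group_with_setdefault, number=10))
```

Both functions build the same grouping, so the comparison isolates the cost of handling missing keys.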
More information on defaultdict is available in the documentation of Python’s collections module.
