What is a Dictionary in Python?
First of all, a dictionary is not a sequence like a list. It is a mutable collection of key:value pairs. Since Python 3.7 it preserves insertion order, but items are looked up by key rather than by position. Keys are always unique, but values need not be. You use the key to access the corresponding value. Where a list index is always an integer, a dictionary key can be any hashable type, such as a string, an integer, a float, or a tuple.
- 1. Python dictionaries are collections of key-value pairs that preserve insertion order (guaranteed since Python 3.7).
- 2. They are mutable, meaning you can add, remove, and modify elements after creation.
- 3. Dictionary keys must be unique and hashable, which in practice means immutable (e.g., strings, numbers, tuples of immutable items).
- 4. Values can be of any data type.
The contents of a dict can be written as a series of key:value pairs within braces { }, e.g.
dict = {key1:value1, key2:value2, ... }.
The “empty dict” is just an empty pair of curly braces {}.
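As a minimal sketch of the syntax above, the following shows a few equivalent ways to build a dictionary. The variable names and sample values are illustrative assumptions, not part of the original text.

```python
# Literal syntax: key:value pairs inside braces
prices = {"apple": 0.5, "banana": 0.25}

# The dict() constructor accepts keyword arguments or (key, value) pairs
from_kwargs = dict(apple=0.5, banana=0.25)
from_pairs = dict([("apple", 0.5), ("banana", 0.25)])

# An empty dictionary; note that {} is a dict, not a set
empty = {}

print(prices == from_kwargs == from_pairs)  # True
print(type(empty).__name__, len(empty))     # dict 0
```

The keyword-argument form only works when the keys are valid Python identifiers, so the literal form is usually the more general choice.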
Differences between List and Dictionary
list = [1, 'A', 2, 3, 2, 8.7]
dict = {0: 1, 1: 'A', 2: 2, 3: 3, 4: 2, 5: 8.7}
A list has only numeric indexes, in sequential order starting from zero.
A dictionary has keys instead of indexes; a key can be a number, a string, or a tuple. Keys carry no positional meaning, but they are always unique.
# No positional indexes 0 1 2 3 4 here; each value is reached only through its key
{'A': 'A', 'Apple': 253, 12: 3, 13: 435, 445: 34}
In Python, dictionaries can have integers as keys, and you can access individual elements using these keys. However, dictionaries do not support slicing like lists or strings, because their items are looked up by key rather than by position.
Accessing Elements in a Dictionary
You can access individual elements in a dictionary by using the keys. Here’s an example:
# Define a dictionary with integer keys
my_dict = {1: 'one', 2: 'two', 3: 'three'}
# Access individual elements
print(my_dict[1]) # Output: 'one'
print(my_dict[2]) # Output: 'two'
print(my_dict[3]) # Output: 'three'
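Square-bracket access assumes the key is present. As a small sketch of what happens otherwise, the example below reuses my_dict from above and contrasts a failed lookup with the get() method; the fallback string 'unknown' is an arbitrary choice for illustration.

```python
my_dict = {1: 'one', 2: 'two', 3: 'three'}

# Square brackets raise KeyError when the key is absent
try:
    my_dict[4]
except KeyError as err:
    print("Missing key:", err)    # Missing key: 4

# get() returns None, or the supplied default, instead of raising
print(my_dict.get(4))             # None
print(my_dict.get(4, 'unknown'))  # unknown
```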
Iterating Over a Dictionary
You can also iterate over the keys, values, or key-value pairs in a dictionary:
# Iterate over keys
for key in my_dict:
    print(key, my_dict[key])
# Iterate over values
for value in my_dict.values():
    print(value)
# Iterate over key-value pairs
for key, value in my_dict.items():
    print(key, value)
Checking if a Key Exists
You can check if a key exists in the dictionary:
if 1 in my_dict:
    print("Key 1 is in the dictionary")
if 4 not in my_dict:
    print("Key 4 is not in the dictionary")
Adding and Modifying Elements
You can add new key-value pairs or modify existing ones:
my_dict[4] = 'four' # Adding a new key-value pair
my_dict[2] = 'TWO' # Modifying an existing value
print(my_dict) # Output: {1: 'one', 2: 'TWO', 3: 'three', 4: 'four'}
Slicing (Not Supported)
Dictionaries do not support slicing. Slicing works with sequences such as lists, tuples, and strings, which are indexed by position, but not with dictionaries, which are indexed by key. If you need to work with a subset of a dictionary, you can use dictionary comprehensions or other techniques to filter the dictionary.
Example of Filtering a Dictionary
# Create a new dictionary with keys greater than 1
filtered_dict = {k: v for k, v in my_dict.items() if k > 1}
print(filtered_dict) # Output: {2: 'TWO', 3: 'three', 4: 'four'}
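Two other ways of taking a subset can be sketched as follows. This assumes Python 3.7 or later, where insertion order is guaranteed; the list named wanted is a made-up example.

```python
from itertools import islice

my_dict = {1: 'one', 2: 'TWO', 3: 'three', 4: 'four'}

# Take the first two items in insertion order
first_two = dict(islice(my_dict.items(), 2))
print(first_two)  # {1: 'one', 2: 'TWO'}

# Pick specific keys, skipping any that are absent
wanted = [2, 4, 99]
subset = {k: my_dict[k] for k in wanted if k in my_dict}
print(subset)     # {2: 'TWO', 4: 'four'}
```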
Complete Example
Here’s a complete example demonstrating various dictionary operations:
# Define a dictionary
my_dict = {1: 'one', 2: 'two', 3: 'three'}
# Access individual elements
print(my_dict[1]) # Output: 'one'
# Iterate over keys
for key in my_dict:
    print(key, my_dict[key])
# Check if a key exists
if 1 in my_dict:
    print("Key 1 is in the dictionary")
# Add and modify elements
my_dict[4] = 'four'
my_dict[2] = 'TWO'
print(my_dict) # Output: {1: 'one', 2: 'TWO', 3: 'three', 4: 'four'}
# Filter dictionary
filtered_dict = {k: v for k, v in my_dict.items() if k > 1}
print(filtered_dict) # Output: {2: 'TWO', 3: 'three', 4: 'four'}
Functions for Dictionaries
Method | Syntax | Description | Example |
clear() | dict.clear() | Removes all items from the dictionary. | my_dict = {'a': 1, 'b': 2}; my_dict.clear() |
copy() | dict.copy() | Returns a shallow copy of the dictionary. | my_dict = {'a': 1, 'b': 2}; new_dict = my_dict.copy() |
get() | dict.get(key[, default]) | Returns the value for the specified key. If the key is not found, returns the default value or None. | my_dict = {'a': 1, 'b': 2}; value = my_dict.get('a') |
items() | dict.items() | Returns a view object of the dictionary's (key, value) tuple pairs. | my_dict = {'a': 1, 'b': 2}; items = my_dict.items() |
keys() | dict.keys() | Returns a view object of all the keys in the dictionary. | my_dict = {'a': 1, 'b': 2}; keys = my_dict.keys() |
values() | dict.values() | Returns a view object of all the values in the dictionary. | my_dict = {'a': 1, 'b': 2}; values = my_dict.values() |
pop() | dict.pop(key[, default]) | Removes the item with the specified key and returns its value. If the key is not found, returns default (or raises KeyError if not provided). | my_dict = {'a': 1, 'b': 2}; value = my_dict.pop('a') |
popitem() | dict.popitem() | Removes and returns the most recently inserted (key, value) pair (last-in, first-out since Python 3.7). Raises KeyError if the dictionary is empty. | my_dict = {'a': 1, 'b': 2}; item = my_dict.popitem() |
update() | dict.update([other]) | Updates the dictionary with elements from another dictionary or from an iterable of (key, value) pairs. | my_dict = {'a': 1, 'b': 2}; my_dict.update({'c': 3}) |
fromkeys() | dict.fromkeys(seq[, value]) | Creates a new dictionary with keys from the given sequence and values set to a default value (None by default). | keys = ['a', 'b', 'c']; my_dict = dict.fromkeys(keys) |
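To see several of these methods working together, here is a short sketch. The dictionary name inventory and its contents are assumptions chosen for illustration.

```python
inventory = {'a': 1, 'b': 2}

snapshot = inventory.copy()        # shallow copy taken before any changes
inventory.update({'c': 3})         # add or overwrite entries from another dict

missing = inventory.get('z', 0)    # 0, because 'z' is absent
popped = inventory.pop('a')        # 1, and 'a' is removed
last = inventory.popitem()         # ('c', 3), the most recently added pair

print(missing, popped, last)       # 0 1 ('c', 3)
print(list(inventory.keys()))      # ['b']
print(snapshot)                    # {'a': 1, 'b': 2}

blank = dict.fromkeys(['x', 'y'])  # every key maps to None
print(blank)                       # {'x': None, 'y': None}

inventory.clear()
print(inventory)                   # {}
```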
Nested Dictionaries
A nested dictionary is a dictionary within another dictionary. This can be used to represent hierarchical data.
nested_dict = {
    'person1': {
        'name': 'Alice',
        'age': 30,
        'address': {
            'city': 'New York',
            'zipcode': '10001'
        }
    },
    'person2': {
        'name': 'Bob',
        'age': 25,
        'address': {
            'city': 'San Francisco',
            'zipcode': '94105'
        }
    }
}
print(nested_dict['person1']['address']['city']) # Output: New York
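Chained square brackets raise KeyError as soon as one level is missing. One possible way to read nested data defensively is to chain get() calls with empty-dict defaults, as sketched below. The helper city_of and the trimmed-down sample data are hypothetical, introduced only for this example.

```python
nested_dict = {
    'person1': {'name': 'Alice', 'address': {'city': 'New York'}},
    'person2': {'name': 'Bob'},  # no address recorded
}

def city_of(people, person_key):
    """Return the city for person_key, or None if any level is missing."""
    person = people.get(person_key, {})
    address = person.get('address', {})
    return address.get('city')

print(city_of(nested_dict, 'person1'))  # New York
print(city_of(nested_dict, 'person2'))  # None
print(city_of(nested_dict, 'person3'))  # None
```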
Dictionaries with Tuples as Keys
Tuples can be used as keys in dictionaries because they are immutable.
tuple_key_dict = {
    ('John', 'Doe'): 1234,
    ('Jane', 'Doe'): 5678
}
print(tuple_key_dict[('John', 'Doe')]) # Output: 1234
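The immutability requirement can be checked directly. In this sketch, a made-up grid keyed by (row, column) tuples works, while the equivalent list key is rejected with a TypeError.

```python
# Grid cells keyed by (row, column); the data is made up for illustration
grid = {(0, 0): 'start', (2, 3): 'goal'}
print(grid[(2, 3)])           # goal
print(grid.get((1, 1), '.'))  # .

# A list is mutable and therefore unhashable, so it cannot be a key
try:
    bad = {[0, 0]: 'start'}
except TypeError:
    list_key_rejected = True
else:
    list_key_rejected = False
print(list_key_rejected)      # True
```

A tuple is only usable as a key when everything inside it is hashable too, so a tuple that contains a list is rejected in the same way.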
List of Dictionaries
control_data = [
{"step_number": 1, "step_name": "Step 1", "run_step": True, "run_on_failure": True, "retry_count": 2},
{"step_number": 2, "step_name": "Step 2", "run_step": True, "run_on_failure": False, "retry_count": 1},
{"step_number": 3, "step_name": "Step 3", "run_step": True, "run_on_failure": True, "retry_count": 2},
{"step_number": 4, "step_name": "Step 4", "run_step": True, "run_on_failure": False, "retry_count": 1},
]
We can create a PySpark DataFrame from the provided data:
from pyspark.sql import SparkSession
# Create a SparkSession
spark = SparkSession.builder.appName("Control Data").getOrCreate()
# Define the data
control_data = [
{"step_number": 1, "step_name": "Step 1", "run_step": True, "run_on_failure": True, "retry_count": 2},
{"step_number": 2, "step_name": "Step 2", "run_step": True, "run_on_failure": False, "retry_count": 1},
{"step_number": 3, "step_name": "Step 3", "run_step": True, "run_on_failure": True, "retry_count": 2},
{"step_number": 4, "step_name": "Step 4", "run_step": True, "run_on_failure": False, "retry_count": 1},
]
# Create a DataFrame
df = spark.createDataFrame(control_data)
# Show the DataFrame
df.show()
Using a generator expression such as Row(**x) for x in control_data is a concise way to create a DataFrame from the control_data list.
Here’s the complete code:
from pyspark.sql import SparkSession
from pyspark.sql import Row
# Create a SparkSession
spark = SparkSession.builder.appName("Control Data").getOrCreate()
# Define the data
control_data = [
{"step_number": 1, "step_name": "Step 1", "run_step": True, "run_on_failure": True, "retry_count": 2},
{"step_number": 2, "step_name": "Step 2", "run_step": True, "run_on_failure": False, "retry_count": 1},
{"step_number": 3, "step_name": "Step 3", "run_step": True, "run_on_failure": True, "retry_count": 2},
{"step_number": 4, "step_name": "Step 4", "run_step": True, "run_on_failure": False, "retry_count": 1},
]
# Convert control data to a DataFrame
control_table_df = spark.createDataFrame(Row(**x) for x in control_data)
# Show the DataFrame
control_table_df.show()
Both versions produce a table like the following (the column order may differ, depending on the Spark version and on whether the rows are dictionaries or Row objects):
+-----------+---------+--------+--------------+-----------+
|step_number|step_name|run_step|run_on_failure|retry_count|
+-----------+---------+--------+--------------+-----------+
|          1|   Step 1|    true|          true|          2|
|          2|   Step 2|    true|         false|          1|
|          3|   Step 3|    true|          true|          2|
|          4|   Step 4|    true|         false|          1|
+-----------+---------+--------+--------------+-----------+
In Python, a generator expression is a concise way to create a generator, which is an iterable object that produces a sequence of values on the fly, without storing them all in memory at once.
In the generator expression Row(**x) for x in control_data:
- Row is a class from the pyspark.sql module, representing a row in a DataFrame.
- **x unpacks the dictionary x into keyword arguments for the Row class.
When you write Row(**x) for x in control_data, it behaves like a generator function with this body:
for x in control_data:
    yield Row(**x)
This creates a generator that produces Row objects, one for each dictionary in control_data. The **x syntax unpacks each dictionary into keyword arguments for the Row constructor, creating a new Row object with the corresponding values.
Generator expressions are useful when working with large datasets, as they:
- Use less memory, since only one value is produced at a time.
- Allow for lazy evaluation, meaning values are only produced when needed.
- Can be used in a variety of contexts, such as creating DataFrames, lists, or other iterables.
In this specific case, the generator expression Row(**x) for x in control_data creates a sequence of Row objects, which is then passed to spark.createDataFrame() to create a DataFrame.
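The lazy behaviour and the **x unpacking can be tried without Spark. The sketch below is plain Python: describe is a hypothetical stand-in for Row, and the two-step control_data is a shortened assumption, so this illustrates the mechanism rather than PySpark itself.

```python
control_data = [
    {"step_number": 1, "retry_count": 2},
    {"step_number": 2, "retry_count": 1},
]

def describe(step_number, retry_count):
    """Hypothetical stand-in for Row: receives the dict's keys as keyword arguments."""
    return f"step {step_number} (retries: {retry_count})"

# Nothing is computed yet; the generator only remembers how to produce values
lazy = (describe(**x) for x in control_data)

first = next(lazy)   # computes only the first item
rest = list(lazy)    # computes whatever is left
again = list(lazy)   # a generator can be consumed only once

print(first)  # step 1 (retries: 2)
print(rest)   # ['step 2 (retries: 1)']
print(again)  # []
```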
Dictionary with Mixed Data Types
Dictionaries can store values of different data types, including lists and other dictionaries.
mixed_dict = {
    'id': 1,
    'name': 'Alice',
    'grades': [85, 90, 78],
    'attributes': {
        'height': 5.5,
        'weight': 135
    },
    'enrolled': True
}
print(mixed_dict['grades']) # Output: [85, 90, 78]
Dictionary with Functions as Values
Dictionaries can store functions as values, allowing you to map keys to specific operations.
def add(a, b):
    return a + b
def subtract(a, b):
    return a - b
operations = {
'add': add,
'subtract': subtract
}
print(operations['add'](10, 5)) # Output: 15
print(operations['subtract'](10, 5)) # Output: 5
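When the operation name comes from user input, a lookup that handles unknown names is often useful. The following is one possible sketch; the wrapper calculate and its error message are assumptions added for illustration.

```python
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

operations = {'add': add, 'subtract': subtract}

def calculate(name, a, b):
    """Look up an operation by name and call it (hypothetical helper)."""
    func = operations.get(name)
    if func is None:
        raise ValueError(f"Unknown operation: {name}")
    return func(a, b)

print(calculate('add', 10, 5))       # 15
print(calculate('subtract', 10, 5))  # 5
```

Adding a new operation then only requires a new dictionary entry, with no change to calculate itself.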
Dictionary Comprehensions
Dictionary comprehensions provide a concise way to create dictionaries.
squares = {x: x**2 for x in range(1, 6)}
print(squares) # Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
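Comprehensions can also filter or transform an existing dictionary. A brief sketch, building on squares from above (the names even_squares and roots are illustrative):

```python
squares = {x: x**2 for x in range(1, 6)}

# Keep only the entries whose value is even
even_squares = {k: v for k, v in squares.items() if v % 2 == 0}
print(even_squares)  # {2: 4, 4: 16}

# Swap keys and values (safe here because the values are unique)
roots = {v: k for k, v in squares.items()}
print(roots[16])     # 4
```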
Using Defaultdict from Collections
defaultdict supplies a default value for any key that has not been set yet.
from collections import defaultdict
default_dict = defaultdict(list)
default_dict['fruits'].append('apple')
default_dict['fruits'].append('banana')
default_dict['vegetables'].append('carrot')
print(default_dict) # Output: defaultdict(<class 'list'>, {'fruits': ['apple', 'banana'], 'vegetables': ['carrot']})
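Another common factory is int, which makes counting straightforward. The sketch below also shows a caveat: merely reading a missing key inserts it. The word "banana" is just sample input.

```python
from collections import defaultdict

letter_counts = defaultdict(int)   # missing keys start at int() == 0
for letter in "banana":
    letter_counts[letter] += 1

print(dict(letter_counts))   # {'b': 1, 'a': 3, 'n': 2}

# Caveat: reading a missing key creates it with the default value
print('z' in letter_counts)  # False
letter_counts['z']
print('z' in letter_counts)  # True
```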
OrderedDict from Collections
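OrderedDict keeps items in the order they were added. Since Python 3.7 a plain dict does this too, but OrderedDict additionally offers move_to_end() and an order-sensitive equality comparison.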
from collections import OrderedDict
ordered_dict = OrderedDict()
ordered_dict['a'] = 1
ordered_dict['b'] = 2
ordered_dict['c'] = 3
print(ordered_dict) # Output: OrderedDict([('a', 1), ('b', 2), ('c', 3)])
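Two behaviours that distinguish OrderedDict from a plain dict are sketched below: order-sensitive equality and in-place reordering. The variable names are illustrative.

```python
from collections import OrderedDict

first = OrderedDict([('a', 1), ('b', 2)])
second = OrderedDict([('b', 2), ('a', 1)])

# OrderedDict equality is order-sensitive; plain dict equality is not
print(first == second)              # False
print(dict(first) == dict(second))  # True

# move_to_end() reorders an existing key in place
first.move_to_end('a')
print(list(first))                  # ['b', 'a']
```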
These examples showcase the versatility and power of Python dictionaries in handling complex data structures and various use cases.
Some Coding Questions on Python Dictionaries
Here are some coding questions on Python dictionaries that are commonly asked in technical interviews. These questions range from basic operations to more complex tasks involving dictionaries:
Basic Level
Check if Key Exists:
def key_exists(d, key):
    return key in d
# Example:
my_dict = {'a': 1, 'b': 2, 'c': 3}
print(key_exists(my_dict, 'b')) # Output: True
print(key_exists(my_dict, 'd')) # Output: False
Dictionary from Two Lists:
def lists_to_dict(keys, values):
    return dict(zip(keys, values))
# Example:
keys = ['a', 'b', 'c']
values = [1, 2, 3]
print(lists_to_dict(keys, values)) # Output: {'a': 1, 'b': 2, 'c': 3}
Remove a Key from Dictionary:
def remove_key(d, key):
    if key in d:
        del d[key]
    return d
# Example:
my_dict = {'a': 1, 'b': 2, 'c': 3}
print(remove_key(my_dict, 'b')) # Output: {'a': 1, 'c': 3}
Intermediate Level
Count Word Frequency:
def word_frequency(s):
    words = s.split()
    freq = {}
    for word in words:
        if word in freq:
            freq[word] += 1
        else:
            freq[word] = 1
    return freq
# Example:
text = "hello world hello"
print(word_frequency(text))
# Output: {'hello': 2, 'world': 1}
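The counting loop above is common enough that the standard library provides collections.Counter for it. A minimal sketch, reusing the same sample text:

```python
from collections import Counter

text = "hello world hello"
freq = Counter(text.split())

print(freq['hello'])        # 2
print(freq['missing'])      # 0, absent keys count as zero
print(freq.most_common(1))  # [('hello', 2)]
```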
The same count can also be kept without a dictionary, using two parallel lists. This is slower, because each membership test and index() call scans the list:
def word_frequency(s):
    words = s.split()
    wordlist = []  # distinct words, in order of first appearance
    count = []     # count[i] is the number of occurrences of wordlist[i]
    for word in words:
        if word in wordlist:
            index = wordlist.index(word)
            count[index] += 1
        else:
            wordlist.append(word)
            count.append(1)
    return count, wordlist
# Example:
text = "hello world hello"
print(word_frequency(text))
# Output: ([2, 1], ['hello', 'world'])
Merge Dictionaries:
def merge_dicts(d1, d2):
    result = d1.copy()
    for key, value in d2.items():
        if key in result:
            if isinstance(result[key], list):
                # Build a new list so the list stored in d1 is not modified
                result[key] = result[key] + [value]
            else:
                result[key] = [result[key], value]
        else:
            result[key] = value
    return result
# Example:
dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 3, 'c': 4}
print(merge_dicts(dict1, dict2))
# Output: {'a': 1, 'b': [2, 3], 'c': 4}
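If colliding values do not need to be collected into a list, a simpler merge is available in which the later dictionary wins. A short sketch, using the same dict1 and dict2:

```python
dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 3, 'c': 4}

# Unpacking builds a new dict; later values win on duplicate keys
merged = {**dict1, **dict2}
print(merged)              # {'a': 1, 'b': 3, 'c': 4}

# update() gives the same result when applied to a copy
combined = dict1.copy()
combined.update(dict2)
print(combined == merged)  # True
print(dict1)               # {'a': 1, 'b': 2}, the inputs are untouched
```

On Python 3.9 and later, dict1 | dict2 gives the same result as the unpacking form.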
Group Anagrams:
def group_anagrams(words):
    anagrams = {}
    for word in words:
        sorted_word = ''.join(sorted(word))
        if sorted_word in anagrams:
            anagrams[sorted_word].append(word)
        else:
            anagrams[sorted_word] = [word]
    return list(anagrams.values())
# Example:
words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_anagrams(words))
# Output: [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
Advanced Level
Find All Paths in a Graph:
def find_all_paths(graph, start, end, path=None):
    # Avoid a mutable default argument; build a fresh list on every call
    path = (path or []) + [start]
    if start == end:
        return [path]
    if start not in graph:
        return []
    paths = []
    for node in graph[start]:
        if node not in path:  # skip nodes already on this path to avoid cycles
            new_paths = find_all_paths(graph, node, end, path)
            for p in new_paths:
                paths.append(p)
    return paths
# Example:
graph = {
'A': ['B', 'C'],
'B': ['C', 'D'],
'C': ['D'],
'D': ['C'],
'E': ['F'],
'F': ['C']
}
print(find_all_paths(graph, 'A', 'D'))
# Output: [['A', 'B', 'C', 'D'], ['A', 'B', 'D'], ['A', 'C', 'D']]
Longest Substring Without Repeating Characters:
def longest_substring_without_repeating(s):
    char_index = {}
    start = max_length = 0
    for i, char in enumerate(s):
        if char in char_index and start <= char_index[char]:
            start = char_index[char] + 1
        else:
            max_length = max(max_length, i - start + 1)
        char_index[char] = i
    return max_length
# Example:
print(longest_substring_without_repeating("abcabcbb")) # Output: 3
print(longest_substring_without_repeating("bbbbb")) # Output: 1
Flatten a Nested Dictionary:
def flatten_dict(d, parent_key='', sep='.'):
    items = []
    for k, v in d.items():
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, dict):
            items.extend(flatten_dict(v, new_key, sep=sep).items())
        else:
            items.append((new_key, v))
    return dict(items)
# Example:
nested_dict = {
'a': 1,
'b': {
'c': 2,
'd': {
'e': 3
}
}
}
print(flatten_dict(nested_dict))
# Output: {'a': 1, 'b.c': 2, 'b.d.e': 3}
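A common follow-up question is the reverse operation. The sketch below is one possible approach, not part of the original solution: unflatten_dict is a hypothetical helper that assumes all keys are strings and that the separator does not appear inside any individual key.

```python
def unflatten_dict(flat, sep='.'):
    """Rebuild a nested dictionary from keys such as 'b.d.e'."""
    nested = {}
    for compound_key, value in flat.items():
        parts = compound_key.split(sep)
        current = nested
        # Walk down, creating intermediate dictionaries as needed
        for part in parts[:-1]:
            current = current.setdefault(part, {})
        current[parts[-1]] = value
    return nested

flat = {'a': 1, 'b.c': 2, 'b.d.e': 3}
print(unflatten_dict(flat))
# {'a': 1, 'b': {'c': 2, 'd': {'e': 3}}}
```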
LRU Cache Implementation:
from collections import OrderedDict
class LRUCache:
    def __init__(self, capacity: int):
        self.cache = OrderedDict()
        self.capacity = capacity
    def get(self, key: int) -> int:
        if key not in self.cache:
            return -1
        else:
            self.cache.move_to_end(key)
            return self.cache[key]
    def put(self, key: int, value: int) -> None:
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)
# Example:
lru_cache = LRUCache(2)
lru_cache.put(1, 1)
lru_cache.put(2, 2)
print(lru_cache.get(1)) # Output: 1
lru_cache.put(3, 3)
print(lru_cache.get(2)) # Output: -1 (evicted because of capacity)
print(lru_cache.get(3)) # Output: 3
Creating Variables Dynamically and Storing Them in a Dictionary for Repeated Use in Python
To dynamically create variables and store them in a dictionary for repeated use in Python, you can use a loop and dictionary operations. This approach allows you to create a flexible number of variables, which can then be accessed and manipulated as needed.
Here’s an example demonstrating how to achieve this:
Example 1: Creating and Storing Variables in a Dictionary
# Initialize an empty dictionary to store variables
variables = {}
# Dynamically create variables and store them in the dictionary
for i in range(1, 6):  # Create 5 variables as an example
    var_name = f"var{i}"  # Generate a variable name like var1, var2, etc.
    variables[var_name] = i * 10  # Assign a value to the variable
# Access and use the variables
for key, value in variables.items():
    print(f"{key} = {value}")
# Output:
# var1 = 10
# var2 = 20
# var3 = 30
# var4 = 40
# var5 = 50
In this example:
- We initialize an empty dictionary, variables.
- We use a loop to dynamically create variable names and assign them values.
- We store these variables and their values in the variables dictionary.
- We then iterate over the dictionary to access and use the variables.
Example 2: Using Dynamic Variables in Functions
You can also use this approach to dynamically create variables and use them within functions.
def create_variables(n):
    variables = {}
    for i in range(1, n + 1):
        var_name = f"var{i}"
        variables[var_name] = i * 10
    return variables
def use_variables(variables):
    for key, value in variables.items():
        print(f"{key} = {value}")
# Create 5 variables dynamically
dynamic_vars = create_variables(5)
# Use the created variables
use_variables(dynamic_vars)
# Output:
# var1 = 10
# var2 = 20
# var3 = 30
# var4 = 40
# var5 = 50
In this example:
- The create_variables function dynamically creates variables and stores them in a dictionary.
- The use_variables function takes the dictionary of variables and prints them.
Example 3: More Complex Data Structures
You can also store more complex data structures, such as lists or dictionaries, as dynamic variables.
# Initialize an empty dictionary to store variables
variables = {}
# Dynamically create variables and store them in the dictionary
for i in range(1, 4):  # Create 3 complex variables as an example
    var_name = f"list_var{i}"
    variables[var_name] = [j * i for j in range(1, 6)]  # Create a list for each variable
# Access and use the variables
for key, value in variables.items():
    print(f"{key} = {value}")
# Output:
# list_var1 = [1, 2, 3, 4, 5]
# list_var2 = [2, 4, 6, 8, 10]
# list_var3 = [3, 6, 9, 12, 15]
In this example:
- We dynamically create lists and store them in the variables dictionary.
- Each list contains values that depend on the loop iteration, demonstrating the flexibility of this approach.
Using a dictionary to store dynamically created variables allows for easy access, modification, and reuse of those variables throughout your program. This technique can be particularly useful in scenarios where the number of variables or their names are not known in advance.
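As a sketch of that reuse, the example below modifies one dynamically named entry later in the program and rejects names that were never created. The sensor names and the record helper are hypothetical, chosen only for illustration.

```python
# Hypothetical sensor readings keyed by names built at run time
readings = {f"sensor{i}": [] for i in range(1, 4)}

def record(name, value):
    """Append a value to the named list, rejecting unknown names."""
    if name not in readings:
        raise KeyError(f"No such variable: {name}")
    readings[name].append(value)

record("sensor2", 21.5)
record("sensor2", 22.0)

print(readings["sensor2"])  # [21.5, 22.0]
print(sorted(readings))     # ['sensor1', 'sensor2', 'sensor3']
```

Because the names are ordinary dictionary keys, a typo produces an explicit error instead of silently creating a new variable.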