Python ALL Eyes on Strings- String Data Type & For Loop Combined



A string is a case-sensitive, immutable sequence of characters enclosed in quotation marks. It can contain letters, digits, whitespace and special characters.

In Python, a string is a sequence of characters enclosed within single quotes (' '), double quotes (" "), or triple quotes (''' ''' or """ """). The opening and closing quotes must be of the same kind: starting a string with one kind and ending it with the other raises a SyntaxError.

Single quotes

my_string1 = 'Hello, World!'

Double quotes

my_string2 = "Python Programming"

Triple quotes for multiline strings

my_string3 = '''This is a
multiline
string.'''

Triple quotes are used when you need to define a multiline string or when quote characters are part of the string. A backslash (\) can also be used to escape a quote so that it is treated as an ordinary character inside the string.

xyz = "Don't Books Ram's "
xyz = 'Don\'t Books Ram\'s'

Behind the Scenes: How Strings Are Stored in Memory and How Assignment to Variables Works

When a string is created in Python, the following happens in memory:

  1. Object Creation: A string object is created in memory to hold the string data.
  2. Character Storage: The individual characters of the string are stored contiguously in memory as a sequence of Unicode code points.
  3. Reference Counting: Python keeps track of how many references (variables or other objects) point to the string object.

When you assign a string to a variable, the following happens:

  1. Variable Creation: A variable is created in the current namespace (e.g., global or local).
  2. Reference Assignment: The variable is assigned a reference (a memory address) to the string object that was created earlier.

Key Points:

  • The variable itself doesn’t store the string data directly; it simply holds a reference to the string object in memory.
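  • Multiple variables can refer to the same string object. Because strings are immutable, the shared object can never be altered through one of them; reassigning one variable only points that variable at a different object and leaves the others untouched.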
  • When the reference count of a string object reaches zero (no variables or objects are pointing to it), the object is eligible for garbage collection and its memory is freed.
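The key points above can be observed directly with the `is` operator, which tests whether two names refer to the same object. This is a minimal sketch; the variable name `same_before` is only illustrative.

```python
x = "oh Boy"
y = x                      # y now refers to the same object as x

same_before = x is y       # one object, two names
x = x + "!"                # builds a new string and rebinds x only

print(same_before)         # True
print(x)                   # oh Boy!
print(y)                   # oh Boy
print(x is y)              # False
```

Rebinding `x` never touches the object that `y` still refers to.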

x=y=z="oh Boy"

or 

x="oh Boy" 

y=x 

z=y

Both x = y = z = "oh Boy" and x = "oh Boy"; y = x; z = y achieve essentially the same result in Python, but with slightly different approaches. Let’s analyze them:

1. x = y = z = "oh Boy" (Multiple Assignment)

  • As explained earlier, this assigns the string “oh Boy” to all three variables (x, y, and z) in a single statement. They all become references to the same string object in memory.

2. x = "oh Boy"; y = x; z = y (Sequential Assignment)

  • x = "oh Boy": First, the variable x is assigned the string “oh Boy”.
  • y = x: Then, the value of x (which is the reference to the string “oh Boy”) is assigned to y. Now, both x and y point to the same string object.
  • z = y: Finally, the value of y (again, the reference to the string “oh Boy”) is assigned to z. All three variables now reference the same string.

Key Difference

The primary difference lies in how the assignments are performed:

  • Multiple assignment is a more concise way to assign the same value to multiple variables in a single step.
  • Sequential assignment breaks down the process into individual steps, which can be useful for readability or if you need to perform additional operations between assignments.

Strings are immutable, meaning once they are created, their contents cannot be changed.

The error message “TypeError: ‘str’ object does not support item assignment” means you’re trying to change a character within a string using indexing. In Python, strings are immutable, meaning you can’t modify them directly after they’re created.


A few key things that make strings special in Python:

1. Immutability:

Unlike many other data types in Python where you can modify the contents of a variable, strings are immutable. This means that once a string is created, its content cannot be changed.

For instance:

message = "Hello, world!"

# Trying to modify a character raises an error
try:
    message[0] = 'X'
except TypeError as error:
    print(error)  # 'str' object does not support item assignment

# You can build a new string instead
modified_message = 'X' + message[1:]
print(modified_message)  # Output: Xello, world!

In this example, attempting to directly change the first character of message results in an error. However, you can create a new string (modified_message) by combining characters or using string methods like slicing.

s = 'RammaR'

s[2] = 'e'

Why does this give an error?

Running s = 'RammaR' followed by s[2] = 'e' gives an error because strings in Python are immutable. This means you cannot change individual characters of a string in place.

When you try to assign ‘e’ to s[2], you are attempting to modify the existing string object, which is not allowed.

To achieve a similar result, you would need to create a new string with the desired modification. For example:

s = 'RammaR'
s = s[:2] + 'e' + s[3:]  # Replace the character at index 2 with 'e'
print(s)  # Output: RameaR

If strings in Python are immutable, then how come this works?

s = 'RammaR'

s = s.replace('R', 'r').lower()

In Python, strings are immutable, meaning you can’t change the existing string object.

When you write s = s.replace('R', 'r').lower(), you are not modifying the original string ‘RammaR’. Instead, you are creating a new string object with the replaced and lowercased characters and assigning it back to the variable s. The original string ‘RammaR’ remains unchanged in memory, but the variable s now points to the newly created string.

Here’s a step-by-step breakdown of what happens:

  1. Initial string: s is initially set to 'RammaR'.
  2. Replacement operation: s.replace('R', 'r') creates a new string 'rammar' where all occurrences of 'R' are replaced with 'r'.
  3. Lowercase operation: .lower() is called on 'rammar', which converts it to 'rammar' (in this case, it remains the same because all characters are already lowercase).
  4. Reassignment: The new string 'rammar' is assigned back to s.
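The steps above can be traced one at a time. In this sketch the names `original`, `step1` and `step2` are introduced only for illustration; they keep each intermediate object visible.

```python
s = "RammaR"
original = s                       # keep a second reference to the first object

step1 = s.replace("R", "r")        # new string: 'rammar'
step2 = step1.lower()              # already lowercase, so the text is unchanged
s = step2                          # rebind s to the final result

print(original)                    # RammaR
print(s)                           # rammar
print(s is original)               # False
```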

The original string 'RammaR' remains in memory until it is no longer referenced by any variable. Once there are no references to 'RammaR', it becomes eligible for garbage collection, meaning the memory can be reclaimed by the Python runtime. This process happens automatically, so you generally don’t need to worry about manually freeing memory.

If you want to ensure that the old string is no longer referenced, you can explicitly delete the reference using the del statement:

del s

However, in this case, since s is reassigned, the reference to the old string is already removed.

2. Sequence of Characters:

Strings are essentially ordered sequences of characters. Each character has its own index position, starting from 0. This allows you to access and manipulate individual characters or substrings within the string using indexing and slicing techniques.
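A short sketch of zero-based indexing and slicing, using an arbitrary example word:

```python
word = "Python"

first = word[0]                # 'P' (indexes start at 0)
third = word[2]                # 't'
last = word[len(word) - 1]     # 'n' (the last valid index is length - 1)
middle = word[1:4]             # 'yth' (index 4 is excluded)

print(first, third, last, middle)
```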

3. Rich Set of Built-in Methods:

Python provides a comprehensive library of built-in string methods that empower you to perform various operations on strings. These methods cover aspects like:

  • Case conversion (upper(), lower())
  • Searching (find(), count())
  • Modification (replace(), split(), join())
  • Extraction (strip())
  • Validation (isalnum(), isalpha())

These methods make string manipulation efficient and avoid the need to write complex loops or functions for common tasks.

4. Unicode Support:

Python strings are inherently unicode strings, meaning they can represent a wide range of characters from different languages and alphabets. This makes Python strings versatile for handling text data from various cultural contexts.
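As a small illustration, `len()` counts characters (code points), while the number of bytes depends on the encoding you choose. The sample text and the UTF-8 encoding here are assumptions made for the example.

```python
text = "日本語"                            # three characters of Japanese text
euro = "€"

char_count = len(text)                    # counts code points, not bytes
byte_count = len(text.encode("utf-8"))    # each of these characters takes 3 bytes in UTF-8
euro_code = ord(euro)                     # code point of the euro sign
euro_back = chr(euro_code)                # and back to the character

print(char_count, byte_count)             # 3 9
print(euro_code, euro_back)               # 8364 €
```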

Some commonly used string functions in Python:

  1. Conversion Functions:
    • upper(): Converts all characters in the string to uppercase.
    • lower(): Converts all characters in the string to lowercase.
    • capitalize(): Capitalizes the first character of the string.
    • title(): Converts the first character of each word to uppercase.
  2. Search and Replace Functions:
    • find(substring): Returns the lowest index in the string where substring is found, or -1 if it is not found.
    • rfind(substring): Returns the highest index in the string where substring is found, or -1 if it is not found.
    • index(substring): Like find(), but raises ValueError if the substring is not found.
    • rindex(substring): Like rfind(), but raises ValueError if the substring is not found.
    • count(substring): Returns the number of occurrences of substring in the string.
    • replace(old, new): Returns a copy of the string with all occurrences of substring old replaced by new.
  3. Substring Functions:
    • startswith(prefix): Returns True if the string starts with the specified prefix, otherwise False.
    • endswith(suffix): Returns True if the string ends with the specified suffix, otherwise False.
    • strip(): Removes leading and trailing whitespace.
    • lstrip(): Removes leading whitespace.
    • rstrip(): Removes trailing whitespace.
    • split(sep): Splits the string into a list of substrings using the specified separator.
    • rsplit(sep): Splits the string from the right end.
    • partition(sep): Splits the string into three parts using the specified separator. Returns a tuple with (head, separator, tail).
    • rpartition(sep): Splits the string from the right end.
  4. String Formatting Functions:
    • format(): Formats the string.
    • join(iterable): Concatenates the elements of the iterable (such as a list of strings), using the string it is called on as the separator.
  5. String Testing Functions:
    • isalpha(): Returns True if all characters in the string are alphabetic.
    • isdigit(): Returns True if all characters in the string are digits.
    • isalnum(): Returns True if all characters in the string are alphanumeric (letters or numbers).
    • isspace(): Returns True if all characters in the string are whitespace.
  6. Miscellaneous Functions:
    • len(): Returns the length of the string.
    • ord(): Returns the Unicode code point of a character.
    • chr(): Returns the character that corresponds to the Unicode code point.

Looping over a String

Strings are objects that contain a sequence of single-character strings.

A single letter is classified as a string in Python. For example, string[0] is considered a string even though it is just a single character.

Here’s how to loop over a string:

my_string = "Hello, World!"

for char in my_string:
    print(char)

In Python, you can use a for loop to iterate over each character in a string. To loop over a string means to start with the first character (position 0) and visit each character in turn until the end of the string (position length - 1).
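If you also need the position of each character, the built-in `enumerate()` yields index and character together. A minimal sketch, with `positions` as an illustrative name:

```python
my_string = "Hello"

positions = []
for index, char in enumerate(my_string):
    positions.append((index, char))
    print(index, char)

last_index = len(my_string) - 1   # the final position visited by the loop
print(last_index)                 # 4
```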

# Get the letters common to two strings, with duplicates removed.
# Variants: sorted alphabetically, unsorted, and unsorted with case ignored.

def commonLetters(str1, str2):
    common = ""
    for i in str1:
        if i in str2 and i not in common:
            common += i
    return "".join(sorted(common))

def commonLettersnosort(str1, str2):
    common = ""
    for i in str1:
        if i in str2 and i not in common:
            common += i
    return common

def commonLettersnosortnocase(str1, str2):
    common = ""
    str2_lower = str2.lower()
    # Compare in lowercase so that 'S' and 's' count as the same letter
    for i in str1.lower():
        if i in str2_lower and i not in common:
            common += i
    return common

Check for -
commonLettersnosort('shyam','Ghanshyam')
commonLettersnosortnocase('shyam','GhanShYam')

Slicing Strings:

You can slice strings using the syntax [start:stop:step], where:

  • start: The starting index of the slice (inclusive).
  • stop: The ending index of the slice (exclusive).
  • step: The step or increment between characters (optional).

If you try to access a single index that is equal to or larger than the length of your string, you’ll get an IndexError. This is because you’re trying to access something that doesn’t exist!

You can also access indexes from the end of the string going towards the start of the string by using negative values. The index [-1] would access the last character of the string, and the index [-2] would access the second-to-last character.
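A brief sketch of negative indexes and of the difference between an out-of-range index and an out-of-range slice (the word used is arbitrary):

```python
word = "Python"

last = word[-1]             # 'n'
second_last = word[-2]      # 'o'

try:
    word[10]                # a single index past the end raises
    message = ""
except IndexError as exc:
    message = str(exc)

empty_slice = word[10:]     # a slice past the end is just empty
clipped = word[2:100]       # an oversized stop is clipped to the string length

print(last, second_last)    # n o
print(message)              # string index out of range
print(repr(empty_slice))    # ''
print(clipped)              # thon
```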

Example:

my_string = "Python Programming"

# Slicing from index 7 to the end
substring1 = my_string[7:]
print(substring1)  # Output: Programming

# Slicing from index 0 to 6
substring2 = my_string[:6]
print(substring2)  # Output: Python

# Slicing from index 7 up to (but not including) index 13 with step 2
substring3 = my_string[7:13:2]
print(substring3)  # Output: Por

string1 = "Greetings, Earthlings"
print(string1[0])   # Prints "G"
print(string1[4:8]) # Prints "ting"
print(string1[11:]) # Prints "Earthlings"
print(string1[:5])  # Prints "Greet"

If the start of a slice (as opposed to a single index) is beyond the end of the string, Python returns an empty string instead of raising an error.

An optional third value in a slice is the stride (step), written after a second colon.

It lets you skip over characters at a fixed interval or, with a negative stride, walk through the string backwards.

print(string1[0::2])    # Prints “Getns atlns”

print(string1[::-1])    # Prints “sgnilhtraE ,sgniteerG”

String Formatting Using the str.format() Method

The str.format() method is a powerful tool in Python for string formatting. It allows you to create dynamic strings by inserting values into placeholders within a string template. Here’s a basic overview of how it works:

Basic Usage:

You start with a string containing placeholder curly braces {} where you want to insert values, and then you call the format() method on that string with the values you want to insert.

Example:

name = "John"
age = 30
print("My name is {} and I am {} years old.".format(name, age))

Output:

My name is John and I am 30 years old.

Positional Arguments:

You can pass values to the format() method in the order that corresponds to the order of the placeholders in the string.

Example:

print("My name is {} and I am {} years old.".format("Alice", 25))

Output:

My name is Alice and I am 25 years old.

Keyword Arguments:

Alternatively, you can pass values using keyword arguments, where the keys match the names of the placeholders.

Example:

print("My name is {name} and I am {age} years old.".format(name="Bob", age=28))

Output:

My name is Bob and I am 28 years old.

Formatting:

You can also specify formatting options within the placeholders to control the appearance of the inserted values, such as precision for floating-point numbers or padding for strings.

Example:

pi = 3.14159
print("The value of pi is {:.2f}".format(pi))

Output:

The value of pi is 3.14

Padding and Alignment:

You can align strings and pad them with spaces or other characters.


left_aligned = "{:<10}".format("left")
right_aligned = "{:>10}".format("right")
center_aligned = "{:^10}".format("center")
print(left_aligned)
print(right_aligned)
print(center_aligned)

Accessing Arguments by Position:

You can access arguments out of order and even multiple times by specifying their positions within the curly braces.

Example:

print("{1} {0} {1}".format("be", "or", "not"))

Output:

or be or

Using Dictionary for Named Placeholders:

You can use a dictionary to specify values for named placeholders.

Example:

data = {'name': 'Sam', 'age': 35}
print("My name is {name} and I am {age} years old.".format(**data))

Output:

My name is Sam and I am 35 years old.

The str.format() method provides great flexibility and readability for string formatting in Python, making it a versatile tool for a wide range of use cases. You can also put a format specification inside the curly brackets, which lets you alter the way the value is formatted. For example, {:.2f} formats the value as a float with two digits after the decimal point. The colon separates the format specification from the field name, if you specified one. You can also specify text alignment using the greater-than sign (>). For example, {:>3.2f} right-aligns the value within a minimum field width of three characters (padding with spaces on the left only if the formatted number is shorter than that) and formats it as a float with two decimal places.
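To see how the width in a format specification behaves, compare a width that is too small to matter with one that actually pads. The sample value and the field name `price` are assumptions for this sketch.

```python
value = 3.14159

narrow = "{:>3.2f}".format(value)            # '3.14' is already 4 characters, so width 3 adds nothing
wide = "{:>8.2f}".format(value)              # padded on the left to 8 characters
left = "{:<8.2f}|".format(value)             # padded on the right; '|' marks the end of the field
named = "{price:>8.2f}".format(price=value)  # the colon separates field name and specification

print(repr(narrow))   # '3.14'
print(repr(wide))     # '    3.14'
print(repr(left))     # '3.14    |'
print(repr(named))    # '    3.14'
```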

# Inserting values
print("My name is {} and I'm {} years old.".format("John", 30))

# Named placeholders
print("My name is {name} and I'm {age} years old.".format(name="John", age=30))

# Number formatting
print("{:,}".format(1234567))  # Output: 1,234,567
print("{:.2f}".format(123.4567))  # Output: 123.46
print("{:.0%}".format(0.1234))  # Output: 12%

# Alignment and padding (the width is a minimum; longer text is not truncated)
print("{:<10}".format("left-aligned"))  # Output: left-aligned
print("{:>10}".format("right-aligned"))  # Output: right-aligned
print("{:^10}".format("centered"))  # Output: centered, with one space on each side
print("{:*^10}".format("centered"))  # Output: *centered*

# Date and time formatting
from datetime import datetime, timedelta

# Current date and time
print("{:%Y-%m-%d %H:%M:%S}".format(datetime.now()))

# Specific date and time
print("{:%Y-%m-%d %H:%M:%S}".format(datetime(2022, 12, 31, 23, 59, 59)))

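# Date and time with timezone (pytz is a third-party package: pip install pytz)
from pytz import timezone
print("{:%Y-%m-%d %H:%M:%S %Z}".format(datetime.now(timezone('US/Eastern'))))

# Date and time with specific format
print("{:%B %d, %Y}".format(datetime.now()))  # Example output: July 29, 2024

# Time duration (timedelta does not accept strftime-style codes, so split it manually)
duration = timedelta(hours=12, minutes=30, seconds=45)
hours, remainder = divmod(int(duration.total_seconds()), 3600)
minutes, seconds = divmod(remainder, 60)
print("{} hours, {} minutes, {} seconds".format(hours, minutes, seconds))  # Output: 12 hours, 30 minutes, 45 seconds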

# Hexadecimal and binary formatting
print("{:x}".format(123))  # Output: 7b
print("{:b}".format(123))  # Output: 1111011

# Conditional formatting
print("The answer is {answer}.".format(answer="yes" if True else "no"))

# Multiline strings
print("""
{title}
{line}
{body}
""".format(title="Title", line="-" * 50, body="This is a multiline string."))

# Repeating strings
print("Hello " * 5)  # Output: Hello Hello Hello Hello Hello

# Slicing strings
print("Hello World"[0:5])  # Output: Hello

# Case conversion
print("hello world".upper())  # Output: HELLO WORLD
print("HELLO WORLD".lower())  # Output: hello world
print("hello world".title())  # Output: Hello World

# String concatenation
print("Hello " + "World")  # Output: Hello World
print(" ".join(["Hello", "World"]))  # Output: Hello World

# String formatting with dictionaries
person = {"name": "John", "age": 30}
print("My name is {name} and I'm {age} years old.".format(**person))

# String formatting with lists
numbers = [1, 2, 3]
print("The numbers are {0}, {1}, and {2}.".format(*numbers))

# String formatting with custom objects
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

person = Person("John", 30)
print("My name is {name} and I'm {age} years old.".format(**person.__dict__))

Using f-Strings (Formatted String Literals)

Introduced in Python 3.6, f-strings provide a concise and readable way to embed expressions inside string literals.

name = "Bob"
age = 25
formatted_string = f"My name is {name} and I am {age} years old."
print(formatted_string)

from datetime import datetime, timedelta

# Get the current year and month
current_year = datetime.now().year
current_month = datetime.now().month

# Create a list of months and years
months_years = [(current_year, month) for month in range(1, 13)]
print("List of months for the current year:")
for year, month in months_years:
    print(f"Year: {year}, Month: {month}")

# Alternatively, create a list for the next 12 months starting from the current month
months_years_dynamic = [(current_year + (current_month + i - 1) // 12, (current_month + i - 1) % 12 + 1) for i in range(12)]
print("\nDynamic list of months starting from the current month:")
for year, month in months_years_dynamic:
    print(f"Year: {year}, Month: {month}")

# Generate report titles for each month of the current year
report_titles = [f"Report for {datetime(current_year, month, 1).strftime('%B %Y')}" for month in range(1, 13)]
print("\nReport Titles for Each Month:")
for title in report_titles:
    print(title)

months = ["January", "February", "March", "April", "May", "June",
         "July", "August", "September", "October", "November", "December"]
years = range(2023, 2025)  # Adjust range as needed

month_year_variables = [f"{month}_{year}" for year in years for month in months]
print(month_year_variables)

# Output: ['January_2023', 'February_2023', ..., 'December_2024']

month_dict = {
    1: "January", 2: "February", 3: "March", 4: "April", 5: "May", 6: "June",
    7: "July", 8: "August", 9: "September", 10: "October", 11: "November", 12: "December"
}

years = range(2023, 2025)

month_year_variables = [f"{month_dict[month]}_{year}" for year in years for month in range(1, 13)]
print(month_year_variables)

# Output: ['January_2023', 'February_2023', ..., 'December_2024']

Formatting Numbers

You can format numbers directly within f-strings.

number = 1234.56789
formatted_number = f"Formatted number: {number:.2f}"
print(formatted_number)
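The same format specifications that work with str.format() can be used after the colon in an f-string. A few illustrative variations on the number above (the variable names are only for this sketch):

```python
number = 1234.56789

two_decimals = f"{number:.2f}"     # round to two decimal places
with_commas = f"{number:,.2f}"     # add a thousands separator
padded = f"{number:>12.1f}"        # right-align in a field 12 characters wide
percent = f"{0.256:.1%}"           # multiply by 100 and append a percent sign

print(two_decimals)   # 1234.57
print(with_commas)    # 1,234.57
print(repr(padded))   # '      1234.6'
print(percent)        # 25.6%
```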

Formatting Using the % Operator

An older method but still in use, the % operator allows for simple string formatting.

name = "Charlie"
age = 22
formatted_string = "My name is %s and I am %d years old." % (name, age)
print(formatted_string)
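The % operator also accepts precision, width and named fields. The following is a small sketch with made-up values:

```python
name = "Charlie"
count = 3
price = 4.5

line = "%s bought %d items at %.2f each" % (name, count, price)
padded = "%5d|%-5d|" % (42, 42)    # right-aligned, then left-aligned, each in 5 characters
by_name = "%(name)s is %(age)d" % {"name": "Charlie", "age": 22}

print(line)      # Charlie bought 3 items at 4.50 each
print(padded)    #    42|42   |
print(by_name)   # Charlie is 22
```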

Split and Join Functions- Really great!!

1. split() function:

The split() function splits a string into a list of substrings based on a specified separator.

Syntax: list_of_words = string.split(separator)

Examples:

# Split a sentence by spaces
text = "This is a string to be split."
word_list = text.split()
print(word_list)
# Output: ['This', 'is', 'a', 'string', 'to', 'be', 'split.']

# Split a CSV string by newlines first, then by commas
csv_data = "name,age,city\nAlice,30,New York\nBob,25,Los Angeles"
data_list = csv_data.split("\n")
for row in data_list:
    fields = row.split(",")  # Split each row by commas within the loop
    print(fields)
# Output:
# ['name', 'age', 'city']
# ['Alice', '30', 'New York']
# ['Bob', '25', 'Los Angeles']

# Split with a custom delimiter
code_snippet = "print('Hello, world!')"
words = code_snippet.split("'")  # Split by single quotes
print(words)
# Output: ['print(', 'Hello, world!', ')']

string: The string you want to split.

separator (optional): The delimiter used to split the string. If not specified, whitespace (spaces, tabs, newlines) is used by default.
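The default whitespace behaviour differs from passing a single space explicitly, which is easy to overlook. The sketch below also uses the optional second argument of split(), `maxsplit`, which is not covered above and limits the number of splits.

```python
text = "a  b   c"                          # irregular runs of spaces

default_split = text.split()               # runs of whitespace collapse into one boundary
explicit_split = text.split(" ")           # every single space is a boundary
limited = "key=value=more".split("=", 1)   # split at the first '=' only

print(default_split)    # ['a', 'b', 'c']
print(explicit_split)   # ['a', '', 'b', '', '', 'c']
print(limited)          # ['key', 'value=more']
```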

2. join() function:

The join() function joins the elements of an iterable (like a list) into a single string, inserting a separator between each element.

Syntax: joined_string = separator.join(iterable)

separator: The string to insert between elements.

iterable: The iterable (list, tuple, etc.) containing the elements to join.

Examples:

# Join with spaces
words = ["Hello", "world", "how", "are", "you?"]
joined_text = " ".join(words)
print(joined_text)
# Output: Hello world how are you?

# Join with a custom separator
data = ["apple", "banana", "cherry"]
comma_separated = ",".join(data)  # Join with commas
print(comma_separated)
# Output: apple,banana,cherry

# Join lines for a multi-line string
lines = ["This is line 1.", "This is line 2."]
multiline_text = "\n".join(lines)
print(multiline_text)
# Output: This is line 1.
#         This is line 2.

Key Points:

  • Both split() and join() work with strings and iterables.
  • split() returns a list of substrings, while join() returns a single string.
  • You can use custom separators for both functions.
  • These functions are versatile for various string manipulation tasks.
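split() and join() are often used as a pair: split, transform each piece, then join again. A minimal sketch with an arbitrary sentence; note that join() only accepts strings, so numbers must be converted first.

```python
sentence = "the quick brown fox"

words = sentence.split()                         # ['the', 'quick', 'brown', 'fox']
capitalized = [w.capitalize() for w in words]    # transform each piece
result = " ".join(capitalized)                   # put the pieces back together

numbers = [1, 2, 3]
joined_numbers = ", ".join(str(n) for n in numbers)  # convert to str before joining

print(result)           # The Quick Brown Fox
print(joined_numbers)   # 1, 2, 3
```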

Data cleaning and Manipulation using String Functions

1. Stripping Whitespace

Removing leading and trailing whitespace from a string.

data = "   Hello, World!   "
cleaned_data = data.strip()
print(f"'{cleaned_data}'")  # Output: 'Hello, World!'

2. Changing Case

Converting the case of a string to upper, lower, or title case.

data = "hello, world!"

upper_case = data.upper()
print(upper_case)  # Output: 'HELLO, WORLD!'

lower_case = data.lower()
print(lower_case)  # Output: 'hello, world!'

title_case = data.title()
print(title_case)  # Output: 'Hello, World!'

3. Replacing Substrings

Replacing occurrences of a substring with another substring.

data = "Hello, World!"

replaced_data = data.replace("World", "Python")
print(replaced_data)  # Output: 'Hello, Python!'

4. Splitting and Joining Strings

Splitting a string into a list of substrings and joining a list of strings into a single string.

data = "apple,banana,cherry"
split_data = data.split(",")
print(split_data)  # Output: ['apple', 'banana', 'cherry']

joined_data = "-".join(split_data)
print(joined_data)  # Output: 'apple-banana-cherry'

5. Checking String Content

Checking if a string starts with, ends with, or contains a substring.

data = "Hello, World!"

starts_with_hello = data.startswith("Hello")
print(starts_with_hello)  # Output: True

ends_with_world = data.endswith("World!")
print(ends_with_world)  # Output: True

contains_python = "Python" in data
print(contains_python)  # Output: False

6. Finding Substrings

Finding the position of a substring within a string.

data = "Hello, World!"

position = data.find("World")
print(position)  # Output: 7

# Finding all occurrences of a substring
indices = []
start = 0
while start < len(data):
    start = data.find("o", start)
    if start == -1:
        break
    indices.append(start)
    start += 1

print(indices)  # Output: [4, 8]

7. Removing Specific Characters

Removing specific characters from a string.

data = "Hello, World!"
cleaned_data = data.translate(str.maketrans('', '', '!,'))
print(cleaned_data)  # Output: 'Hello World'

8. String Formatting

Using formatted strings (f-strings) to insert variables into strings.

name = "Alice"
age = 30
formatted_string = f"My name is {name} and I am {age} years old."
print(formatted_string)  # Output: 'My name is Alice and I am 30 years old.'

9. Padding and Aligning Strings

Adding padding and aligning strings to a specific width.

data = "Hello"

left_padded = data.ljust(10)
print(f"'{left_padded}'")  # Output: 'Hello     '

right_padded = data.rjust(10)
print(f"'{right_padded}'")  # Output: '     Hello'

center_padded = data.center(10)
print(f"'{center_padded}'")  # Output: '  Hello   '

10. Checking for Alphanumeric Characters

Checking if a string contains only alphanumeric characters, digits, or alphabets.

data = "Hello123"

is_alphanumeric = data.isalnum()
print(is_alphanumeric)  # Output: True

is_digit = data.isdigit()
print(is_digit)  # Output: False

is_alpha = data.isalpha()
print(is_alpha)  # Output: False

11. String Concatenation in Python

Python provides several ways to combine strings, variables, and numbers into a single string:

1. Using the + operator:

This is the most common and straightforward approach for basic concatenation.

name = "Alice"
age = 30
greeting = "Hello, " + name + "! You are " + str(age) + " years old."
print(greeting)  # Output: Hello, Alice! You are 30 years old.

Explanation:

  • The + operator is used to concatenate strings.
  • We convert the integer age to a string using str(age) before adding it to the string.

2. Using f-strings (Python 3.6+):

f-strings offer a cleaner and more readable way to embed variables within strings.

name = "Alice"
age = 30
greeting = f"Hello, {name}! You are {age} years old."
print(greeting)  # Output: Hello, Alice! You are 30 years old.

Explanation:

  • Curly braces {} are used to indicate places where variables should be inserted.
  • The variable names are directly referenced within the braces.

3. Using the .format() method:

While less common than f-strings, the .format() method provides more flexibility for complex formatting needs.

name = "Alice"
age = 30
greeting = "Hello, {}! You are {} years old.".format(name, age)
print(greeting)  # Output: Hello, Alice! You are 30 years old.

Explanation:

  • The .format() method is called on the base string.
  • Placeholders {} in the string are replaced with the provided arguments.

Use Case Example: Data Cleaning Pipeline

Here is an example of how you might use these functions together to clean a list of strings.

data_list = [
    "  Alice,30,Engineer  ",
    " Bob,25,Data Scientist  ",
    "Charlie, , Doctor",
    "  ,40,Lawyer",
    "David,35,   "
]

cleaned_data_list = []

for data in data_list:
    # Strip leading and trailing whitespace
    data = data.strip()
    
    # Split into components
    parts = data.split(',')
    
    # Clean each part
    cleaned_parts = [part.strip() for part in parts]
    
    # Replace empty strings with None
    cleaned_parts = [part if part else None for part in cleaned_parts]
    
    # Rejoin cleaned parts
    cleaned_data = ",".join([part if part else "" for part in cleaned_parts])
    
    cleaned_data_list.append(cleaned_data)

print("Cleaned Data List:")
for cleaned_data in cleaned_data_list:
    print(cleaned_data)

Good Examples

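1. To find whether a given string starts with a vowel.



def startsWithVowel(str1):
    # An empty string has no first character, so it cannot start with a vowel
    if str1 and str1[0] in "aeiouAEIOU":
        return True
    return False

print(startsWithVowel("Apple"))   # Output: True
print(startsWithVowel("banana"))  # Output: False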

2. How to check whether two words are anagrams

Here are two effective ways to check if two words are anagrams in Python:

Method 1: Sorting

This approach sorts both words alphabetically and then compares them. If the sorted strings are equal, they are anagrams.

def is_anagram_sort(str1, str2):
  """
  Checks if two strings are anagrams using sorting

  Args:
      str1: First string
      str2: Second string

  Returns:
      True if anagrams, False otherwise
  """
  # Convert both strings to lowercase and remove whitespaces (optional)
  str1 = str1.lower().replace(" ", "")
  str2 = str2.lower().replace(" ", "")

  # Check if lengths are equal (anagrams must have the same number of characters)
  if len(str1) != len(str2):
    return False

  # Sort both strings
  sorted_str1 = sorted(str1)
  sorted_str2 = sorted(str2)

  # Compare the sorted strings
  return sorted_str1 == sorted_str2

# Example usage
str1 = "race"
str2 = "care"
if is_anagram_sort(str1, str2):
  print(str1, "and", str2, "are anagrams")
else:
  print(str1, "and", str2, "are not anagrams")

Explanation:

  1. The function is_anagram_sort takes two strings (str1 and str2) as input.
  2. It converts both strings to lowercase and removes whitespaces (optional) for case-insensitive and whitespace-insensitive comparison.
  3. It checks if the lengths of the strings are equal. If not, they cannot be anagrams.
  4. It sorts both strings using sorted(), which returns a list of the characters in alphabetical order.
  5. It compares the two sorted lists to see if they are identical. If so, the original strings are anagrams.

Method 2: Counting Characters

This method uses a dictionary to count the occurrences of each character in both strings. If the resulting dictionaries are the same, the strings are anagrams.

from collections import Counter

def is_anagram_count(str1, str2):
  """
  Checks if two strings are anagrams using character counting

  Args:
      str1: First string
      str2: Second string

  Returns:
      True if anagrams, False otherwise
  """
  # Convert both strings to lowercase and remove whitespaces (optional)
  str1 = str1.lower().replace(" ", "")
  str2 = str2.lower().replace(" ", "")

  # Create dictionaries to count character occurrences
  char_count_str1 = Counter(str1)
  char_count_str2 = Counter(str2)

  # Compare the dictionaries
  return char_count_str1 == char_count_str2

# Example usage
str1 = "listen"
str2 = "silent"
if is_anagram_count(str1, str2):
  print(str1, "and", str2, "are anagrams")
else:
  print(str1, "and", str2, "are not anagrams")

Explanation:

  1. The function is_anagram_count takes two strings (str1 and str2) as input.
  2. It converts both strings to lowercase and removes whitespaces (optional) for case-insensitive and whitespace-insensitive comparison.
  3. It uses Counter from the collections library to create dictionaries for each string. Each key in the dictionary represents a character, and the value represents the number of times that character appears in the string.
  4. It compares the two dictionaries using the equality operator (==). If the dictionaries have the same keys (characters) with the same corresponding values (occurrences), the strings are anagrams.
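To make the comparison in step 4 concrete, here is what the Counter objects contain for a pair of sample words (the words are arbitrary):

```python
from collections import Counter

counts = Counter("listen")
e_count = counts["e"]          # each letter of "listen" appears once
z_count = counts["z"]          # missing keys count as zero instead of raising KeyError

same_letters = Counter("listen") == Counter("silent")   # same letters, same counts
different = Counter("aab") == Counter("abb")            # same letters, different counts

print(e_count, z_count)   # 1 0
print(same_letters)       # True
print(different)          # False
```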

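3. To check if a word has double consecutive letters.

def hasDouble(str1):
    str1 = str1.lower()
    # Compare every character with the one that follows it.
    # (Using str1.index(ch) here would always return the first occurrence
    # of ch, so a later pair such as the "bb" in "abcbb" would be missed.)
    for i in range(len(str1) - 1):
        if str1[i] == str1[i + 1]:
            return True
    return False

print(hasDouble("Google"))     # Output: True
print(hasDouble("Microsoft"))  # Output: False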
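
A more compact way to express the same check is to pair each character with its successor using zip. This is only a sketch; the name has_double is a hypothetical alternative, not part of the original example.

```python
def has_double(word):
    """Return True if any two neighbouring characters are equal (case-insensitive)."""
    word = word.lower()
    # zip(word, word[1:]) yields each character paired with the one after it
    return any(a == b for a, b in zip(word, word[1:]))

print(has_double("Google"))     # True
print(has_double("Microsoft"))  # False
print(has_double("abcbb"))      # True (the pair sits at the end of the word)
```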

Really Good Examples

1. How to check if a string is a palindrome in Python

You can check if a string is a palindrome in Python by comparing the string with its reverse. If the string is the same when reversed, it’s a palindrome. Here’s a simple way to do it:

def is_palindrome(s):
    # Remove spaces and convert to lowercase for case-insensitive comparison
    s = s.replace(" ", "").lower()
    # Compare the string with its reverse
    return s == s[::-1]

# Test the function
print(is_palindrome("radar"))  # Output: True
print(is_palindrome("hello"))  # Output: False

The function is_palindrome() takes a string s as input, removes spaces and converts it to lowercase so that the comparison ignores case and spacing. It then compares this cleaned-up string with its reverse, obtained with the slice s[::-1]. If they are equal, the function returns True, indicating that the string is a palindrome; otherwise, it returns False.
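
Removing only spaces is not enough for phrases that contain punctuation. The sketch below is one possible extension, under the assumption that only letters and digits should count; the name is_palindrome_phrase is hypothetical.

```python
def is_palindrome_phrase(text):
    """Palindrome check that ignores case, spaces and punctuation."""
    # Keep only letters and digits, lowercased
    cleaned = "".join(ch.lower() for ch in text if ch.isalnum())
    return cleaned == cleaned[::-1]

print(is_palindrome_phrase("A man, a plan, a canal: Panama"))  # True
print(is_palindrome_phrase("Hello, World!"))                   # False
```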

OneLiner

s = 'wew'

s.replace(" ", "").lower() == s.replace(" ", "").lower()[::-1]

2. Transform to pig Latin

Pig Latin is a simple text transformation that moves the first character of each word to the end of the word and then appends “ay”.

You can create a Python function to convert text into Pig Latin by following these steps:

  1. Split the text into words.
  2. For each word:
    • Move the first character to the end of the word.
    • Append “ay” to the end of the word.
  3. Join the modified words back into a single string.

Here’s the implementation of the function:

def pig_latin(text):
    # Split the text into words
    words = text.split()
    # List to store Pig Latin words
    pig_latin_words = []
    # Iterate over each word
    for word in words:
        # Move the first character to the end and append "ay"
        pig_latin_word = word[1:] + word[0] + "ay"
        # Append the modified word to the list
        pig_latin_words.append(pig_latin_word)
    # Join the Pig Latin words back into a single string
    pig_latin_text = " ".join(pig_latin_words)
    return pig_latin_text

# Test the function
text = "hello world"
print(pig_latin(text))  # Output: "ellohay orldway"

This function splits the input text into words, processes each word to convert it into Pig Latin, and then joins the modified words back into a single string. You can test it with different input texts to see how it transforms them into Pig Latin.

3. To calculate prefix and suffix scores for string comparisons. The prefix score is the length of the longest common prefix, and the suffix score is the length of the longest common suffix between two strings.

def prefix_score(s1, s2):
    score = 0
    for i in range(min(len(s1), len(s2))):
        if s1[i] == s2[i]:
            score += 1
        else:
            break
    return score

def suffix_score(s1, s2):
    score = 0
    for i in range(1, min(len(s1), len(s2)) + 1):
        if s1[-i] == s2[-i]:
            score += 1
        else:
            break
    return score

# Example usage
s1 = 'ram'
s2 = 'rafgert'
s3 = 'genam'

print("Prefix score of s1 and s2:", prefix_score(s1, s2))  # Output: 2
print("Suffix score of s1 and s3:", suffix_score(s1, s3))  # Output: 2
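
As a cross-check, the same scores can be obtained from the standard library. This sketch assumes os.path.commonprefix is acceptable here; despite its module, it compares plain strings character by character. The function names are hypothetical.

```python
import os.path

def prefix_score_builtin(s1, s2):
    # commonprefix works character by character, not path component by component
    return len(os.path.commonprefix([s1, s2]))

def suffix_score_builtin(s1, s2):
    # A common suffix is a common prefix of the reversed strings
    return len(os.path.commonprefix([s1[::-1], s2[::-1]]))

print(prefix_score_builtin("ram", "rafgert"))  # 2
print(suffix_score_builtin("ram", "genam"))    # 2
```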

4. Longest Common Subsequence (LCS)

The longest common subsequence problem is to find the longest subsequence common to all sequences in a set of sequences (often just two sequences). A subsequence is a sequence that appears in the same relative order, but not necessarily consecutively.

Problem Description

Given two strings, write a Python function to find the length of their longest common subsequence. For example, for the strings s1 = "ABCBDAB" and s2 = "BDCAB", the longest common subsequence is "BDAB" or "BCAB", and its length is 4.

Solution

We’ll use dynamic programming to solve this problem efficiently.

def lcs_length(s1, s2):
    m, n = len(s1), len(s2)
    # Create a 2D array to store lengths of longest common subsequence.
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    # Build the dp array from the bottom up.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # dp[m][n] contains the length of LCS for s1 and s2.
    return dp[m][n]

# Example usage
s1 = "ABCBDAB"
s2 = "BDCAB"
print("Length of LCS:", lcs_length(s1, s2))  # Output: 4

The variant below also reconstructs the subsequence itself by backtracking through the table:

def lcs_dp(X, Y):
  """
  Finds the length and sequence of LCS using dynamic programming

  Args:
      X: First string
      Y: Second string

  Returns:
      Length of the LCS and the LCS sequence
  """
  m = len(X)
  n = len(Y)

  # Create a table to store LCS lengths
  lcs_table = [[0 for _ in range(n + 1)] for _ in range(m + 1)]

  # Fill the table using dynamic programming
  for i in range(m + 1):
    for j in range(n + 1):
      if i == 0 or j == 0:
        lcs_table[i][j] = 0
      elif X[i-1] == Y[j-1]:
        lcs_table[i][j] = lcs_table[i-1][j-1] + 1
      else:
        lcs_table[i][j] = max(lcs_table[i-1][j], lcs_table[i][j-1])

  # Backtrack to find the LCS sequence
  lcs = ""
  i = m
  j = n
  while i > 0 and j > 0:
    if X[i-1] == Y[j-1]:
      lcs = X[i-1] + lcs
      i -= 1
      j -= 1
    else:
      if lcs_table[i-1][j] > lcs_table[i][j-1]:
        i -= 1
      else:
        j -= 1

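  # Return the length and the LCS sequence (already in the right order,
  # because each matched character was prepended during backtracking)
  return lcs_table[m][n], lcs

# Example usage
X = "AGGTAB"
Y = "GXTXAYB"
length, lcs_sequence = lcs_dp(X, Y)  # "length" avoids shadowing the lcs_length function
print("Length of LCS:", length)  # Output: 4
print("LCS sequence:", lcs_sequence)  # Output: GTAB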

Explanation

  1. Initialization: We initialize a 2D list dp of size (m+1) x (n+1) where m and n are the lengths of the input strings. Each entry dp[i][j] will store the length of the LCS of the substrings s1[0..i-1] and s2[0..j-1].
  2. Filling the DP Table: We iterate over each character of both strings. If the characters match, we add 1 to the value of the diagonal cell (dp[i-1][j-1]). If they don’t match, we take the maximum value from the cell above (dp[i-1][j]) or the cell to the left (dp[i][j-1]).
  3. Result: The value at dp[m][n] will contain the length of the LCS of the two input strings.

This approach ensures we compute the LCS length efficiently with a time complexity of O(m*n) and a space complexity of O(m*n).
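
When only the length is needed, the O(m*n) space can be reduced, because each row of the table depends only on the row above it. The following is a minimal sketch of that idea, not the version used above; the name lcs_length_two_rows is hypothetical.

```python
def lcs_length_two_rows(s1, s2):
    """LCS length that keeps two rows of the table instead of the full grid."""
    # Make s2 the shorter string so each row is as small as possible
    if len(s2) > len(s1):
        s1, s2 = s2, s1
    prev = [0] * (len(s2) + 1)
    for ch1 in s1:
        curr = [0]  # column 0 is always 0 (empty prefix of s2)
        for j, ch2 in enumerate(s2, start=1):
            if ch1 == ch2:
                curr.append(prev[j - 1] + 1)            # diagonal cell + 1
            else:
                curr.append(max(prev[j], curr[j - 1]))  # cell above or cell to the left
        prev = curr
    return prev[-1]

print(lcs_length_two_rows("ABCBDAB", "BDCAB"))   # 4
print(lcs_length_two_rows("AGGTAB", "GXTXAYB"))  # 4
```

The time complexity stays O(m*n), while the extra space drops to O(min(m, n)). The trade-off is that the subsequence itself can no longer be recovered by backtracking.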

Recursive Approach:

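This method breaks the problem into smaller subproblems and solves them top-down, starting from the full strings. Without memoization it recomputes the same subproblems many times, so its running time grows exponentially in the worst case. Here’s the implementation: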

def lcs_recursive(X, Y, m, n):
  """
  Finds the length of LCS using recursion

  Args:
      X: First string
      Y: Second string
      m: Length of first string
      n: Length of second string

  Returns:
      Length of the LCS
  """
  if m == 0 or n == 0:
    return 0
  if X[m-1] == Y[n-1]:
    return 1 + lcs_recursive(X, Y, m-1, n-1)
  else:
    return max(lcs_recursive(X, Y, m-1, n), lcs_recursive(X, Y, m, n-1))

# Example usage
X = "AGGTAB"
Y = "GXTXAYB"
m = len(X)
n = len(Y)
length = lcs_recursive(X, Y, m, n)  # "length" avoids shadowing the lcs_length function
print("Length of LCS:", length)  # Output: 4

Explanation:

  • The function lcs_recursive takes four arguments:
    • X: First string
    • Y: Second string
    • m: Length of X
    • n: Length of Y
  • The base case checks if either string is empty. If so, the LCS length is 0.
  • If the last characters of both strings match, the LCS length is 1 plus the LCS length of the two strings with their last characters removed (X[0:m-1] and Y[0:n-1]).
  • If the last characters don’t match, the LCS length is the larger of two results: the LCS length with the last character of X dropped, and the LCS length with the last character of Y dropped.
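
The plain recursion repeats a lot of work. A common remedy is to cache the result for each (m, n) pair. The sketch below does this with functools.lru_cache; the names lcs_memo and solve are hypothetical, and very long strings could still hit Python's recursion limit.

```python
from functools import lru_cache

def lcs_memo(X, Y):
    """Top-down LCS length with memoization."""
    @lru_cache(maxsize=None)
    def solve(m, n):
        # Same three cases as the plain recursion, but each (m, n) is computed once
        if m == 0 or n == 0:
            return 0
        if X[m - 1] == Y[n - 1]:
            return 1 + solve(m - 1, n - 1)
        return max(solve(m - 1, n), solve(m, n - 1))
    return solve(len(X), len(Y))

print(lcs_memo("AGGTAB", "GXTXAYB"))  # 4
```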

5. Longest Palindromic Substring

Given a string, find the longest substring which is a palindrome. For example, given the string “babad”, the longest palindromic substring is “bab” (or “aba”).

def longest_palindromic_substring(s):
    n = len(s)
    if n == 0:
        return ""
    
    # dp[i][j] is True when the substring s[i..j] is a palindrome
    dp = [[False] * n for _ in range(n)]
    
    start = 0
    max_length = 1
    
    # All substrings of length 1 are palindromic
    for i in range(n):
        dp[i][i] = True
    
    # Check for substrings of length 2
    for i in range(n - 1):
        if s[i] == s[i + 1]:
            dp[i][i + 1] = True
            start = i
            max_length = 2
    
    # Check for lengths greater than 2
    for length in range(3, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            
            # Checking for palindromic substring
            if s[i] == s[j] and dp[i + 1][j - 1]:
                dp[i][j] = True
                if length > max_length:
                    start = i
                    max_length = length
    
    return s[start:start + max_length]

# Example usage
s = "babad"
print("Longest Palindromic Substring:", longest_palindromic_substring(s))  # Output: "bab" ("aba" is an equally valid answer)
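
The table-based solution uses O(n^2) memory. An alternative that is often shown for this problem expands outwards from every possible centre and needs only constant extra space. This is a sketch of that technique under a hypothetical name, not a replacement for the code above.

```python
def longest_palindrome_expand(s):
    """Longest palindromic substring by expanding around every centre."""
    def expand(left, right):
        # Grow outwards while the characters on both sides match
        while left >= 0 and right < len(s) and s[left] == s[right]:
            left -= 1
            right += 1
        return left + 1, right  # slice bounds of the palindrome found

    best_start, best_end = 0, 0
    for centre in range(len(s)):
        # Odd-length palindromes have one centre character, even-length ones have two
        for lo, hi in (expand(centre, centre), expand(centre, centre + 1)):
            if hi - lo > best_end - best_start:
                best_start, best_end = lo, hi
    return s[best_start:best_end]

print(longest_palindrome_expand("babad"))  # bab
print(longest_palindrome_expand("cbbd"))   # bb
```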

6. Regular Expression Matching

Given an input string s and a pattern p, implement regular expression matching with support for '.' and '*'. '.' matches any single character, and '*' matches zero or more of the preceding element.

def is_match(s, p):
    dp = [[False] * (len(p) + 1) for _ in range(len(s) + 1)]
    dp[0][0] = True

    for i in range(1, len(p) + 1):
        if p[i - 1] == '*':
            dp[0][i] = dp[0][i - 2]

    for i in range(1, len(s) + 1):
        for j in range(1, len(p) + 1):
            if p[j - 1] == '.' or p[j - 1] == s[i - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            elif p[j - 1] == '*':
                dp[i][j] = dp[i][j - 2]
                if p[j - 2] == '.' or p[j - 2] == s[i - 1]:
                    dp[i][j] = dp[i][j] or dp[i - 1][j]
    return dp[-1][-1]

# Example usage
s = "aab"
p = "c*a*b"
print("Regular Expression Match:", is_match(s, p))  # Output: True
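
One way to gain confidence in a matcher like this is to compare it with Python's re module on patterns that use only letters, '.' and '*'. The sketch below restates the same rules as a memoized recursion (hypothetical name is_match_memo) and cross-checks a few cases with re.fullmatch. The re module supports far more syntax, so the comparison is only meaningful for this restricted pattern language.

```python
import re
from functools import lru_cache

def is_match_memo(s, p):
    """Top-down version of the same matching rules ('.' and '*')."""
    @lru_cache(maxsize=None)
    def match(i, j):
        if j == len(p):
            return i == len(s)
        # Does the current pattern character match the current text character?
        first = i < len(s) and p[j] in (s[i], ".")
        if j + 1 < len(p) and p[j + 1] == "*":
            # Either skip "x*" entirely, or consume one character and stay on "x*"
            return match(i, j + 2) or (first and match(i + 1, j))
        return first and match(i + 1, j + 1)
    return match(0, 0)

# Cross-check against the standard library on a few simple patterns
cases = [("aab", "c*a*b"), ("aa", "a"), ("ab", ".*"), ("mississippi", "mis*is*p*.")]
for text, pattern in cases:
    assert is_match_memo(text, pattern) == bool(re.fullmatch(pattern, text))
    print(text, pattern, is_match_memo(text, pattern))
```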

7. Minimum Window Substring

Given two strings s and t, return the minimum window in s which will contain all the characters in t. If there is no such window in s that covers all characters in t, return the empty string "".

from collections import Counter

def min_window(s, t):
    if not t or not s:
        return ""
    
    dict_t = Counter(t)
    required = len(dict_t)
    
    l, r = 0, 0
    formed = 0
    window_counts = {}
    
    ans = float("inf"), None, None
    
    while r < len(s):
        character = s[r]
        window_counts[character] = window_counts.get(character, 0) + 1
        
        if character in dict_t and window_counts[character] == dict_t[character]:
            formed += 1
        
        while l <= r and formed == required:
            character = s[l]
            
            if r - l + 1 < ans[0]:
                ans = (r - l + 1, l, r)
            
            window_counts[character] -= 1
            if character in dict_t and window_counts[character] < dict_t[character]:
                formed -= 1
            
            l += 1
        
        r += 1
    
    return "" if ans[0] == float("inf") else s[ans[1]:ans[2] + 1]

# Example usage
s = "ADOBECODEBANC"
t = "ABC"
print("Minimum Window Substring:", min_window(s, t))  # Output: "BANC"
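
For short inputs, a brute-force version is a handy reference to test the sliding-window solution against. The sketch below tries every window from shortest to longest; the name min_window_bruteforce is hypothetical and the approach is far too slow for long strings.

```python
from collections import Counter

def min_window_bruteforce(s, t):
    """Try every window, shortest first; only suitable for short inputs."""
    need = Counter(t)
    for size in range(len(t), len(s) + 1):
        for start in range(len(s) - size + 1):
            window = s[start:start + size]
            # Counter subtraction drops non-positive counts, so an empty
            # result means the window covers every character of t
            if not need - Counter(window):
                return window
    return ""

print(min_window_bruteforce("ADOBECODEBANC", "ABC"))  # BANC
print(repr(min_window_bruteforce("a", "aa")))         # ''
```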

8. Balanced Parentheses:

  • Problem: Given a string of parentheses ((), {}, []), determine if the parentheses are balanced. A string is balanced if each opening parenthesis has a corresponding closing parenthesis of the same type and in the correct order.
  • Example:
    • Input: "{[]}" (balanced)
    • Input: "([)]" (unbalanced)
  • Challenge: Solve this efficiently using a stack or recursion.

Solution (Stack Approach):

def is_balanced(expression):
  """
  Checks if the parentheses in a string are balanced

  Args:
      expression: String containing parentheses

  Returns:
      True if balanced, False otherwise
  """
  mapping = {"(": ")", "{": "}", "[": "]"}  # Mapping for opening and closing parentheses
  stack = []
  for char in expression:
    if char in mapping:  # If it's an opening parenthesis, push it onto the stack
      stack.append(char)
    else:  # If it's a closing parenthesis
      if not stack or mapping[stack.pop()] != char:  # Check if it matches the top of the stack
        return False
  return not stack  # If the stack is empty at the end, all parentheses were balanced

# Example usage
expression = "{[]}"
if is_balanced(expression):
  print("Balanced parentheses")
else:
  print("Unbalanced parentheses")
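
The function above assumes the input consists only of brackets; any other character is treated as a closing bracket and makes the result False. If the input may be a full expression, one option is to skip non-bracket characters. The sketch below shows that variant under a hypothetical name.

```python
def is_balanced_expr(expression):
    """Bracket check that skips characters which are not brackets."""
    closing_to_opening = {")": "(", "}": "{", "]": "["}
    openings = set(closing_to_opening.values())
    stack = []
    for char in expression:
        if char in openings:
            stack.append(char)
        elif char in closing_to_opening:
            # A closing bracket must match the most recent opening one
            if not stack or stack.pop() != closing_to_opening[char]:
                return False
        # Any other character (letters, digits, spaces) is ignored
    return not stack

print(is_balanced_expr("{[]}"))            # True
print(is_balanced_expr("([)]"))            # False
print(is_balanced_expr("print(data[0])"))  # True
```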

9. Group Shifted Strings:

  • Problem: Given an array of strings, group together all strings that can be turned into one another by shifting every letter the same number of positions through the alphabet (wrapping from “z” back to “a”).
  • Example:
    • Input: ["abc", "bcd", "abcde", "bcdx"]
    • Output: [["abc", "bcd"], ["abcde"], ["bcdx"]] (Explanation: “bcd” is “abc” with every letter shifted by one position; “abcde” has a different length, and “bcdx” doesn’t follow the same letter spacing, so each of them forms its own group)
  • Challenge: Solve this efficiently in time and space complexity. Consider using a hash table or rolling hash function.

Solution (Hash Table Approach):

from collections import defaultdict  # defaultdict creates the empty list for a new key automatically

def group_shifted_strings(strs):
  """
  Groups strings that are shifted versions of one another

  Args:
      strs: Array of lowercase strings

  Returns:
      List of lists, where each inner list contains grouped strings
  """
  groups = defaultdict(list)
  for word in strs:
    # Create a key by shifting the word so that its first letter becomes 'a'.
    # All shifted versions of a word normalize to the same key.
    key = ''.join([chr((ord(char) - ord(word[0])) % 26 + ord('a')) for char in word])
    groups[key].append(word)
  return list(groups.values())

# Example usage
strs = ["abc", "bcd", "abcde", "bcdx"]
groups = group_shifted_strings(strs)
print("Grouped strings:", groups)  # Output: [['abc', 'bcd'], ['abcde'], ['bcdx']]

10. Reverse Words in a String (with Spaces):

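  • Problem: Given a string, reverse the order of its words in place, without building a second string. Python strings are immutable, so the solution first copies the characters into a list and then works in place on that list.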
  • Example:
    • Input: “This is a string”
    • Output: “string a is This”
  • Challenge: Solve this in-place with a two-pointer approach.

Solution (Two-Pointer Approach):

def reverse_words(s):
  """
  Reverses the words in a string in-place

  Args:
      s: String to be reversed
  """
  s = list(s)  # Convert string to a list for in-place modification
  n = len(s)

  # Reverse the entire string first
  i, j = 0, n - 1
  while i < j:
    s[i], s[j] = s[j], s[i]
    i += 1
    j -= 1

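  # Reverse individual words within the reversed string
  start = 0
  for i in range(n + 1):
    # A word ends at a space or at the end of the string
    if i == n or s[i] == " ":
      # Reverse the word from start to i-1 (excluding the space)
      j = start
      k = i - 1
      while j < k:
        s[j], s[k] = s[k], s[j]
        j += 1
        k -= 1
      start = i + 1

  return "".join(s)

# Example usage
print(reverse_words("This is a string"))  # Output: "string a is This"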
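
For comparison, when the in-place constraint does not matter, the same result can be had with split and join. This is a sketch under a hypothetical name; unlike the two-pointer version, it collapses repeated spaces.

```python
def reverse_words_simple(s):
    # split() with no arguments splits on runs of whitespace and drops empty pieces
    return " ".join(reversed(s.split()))

print(reverse_words_simple("This is a string"))  # string a is This
```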

String functions for revision in Python: let us revise what we learned!

Essential String Functions in Python:

  1. len(string): Returns the length of the string (number of characters).
    • Example: length = len("Hello, World!") # length will be 13
  2. string.upper(): Converts all lowercase letters in the string to uppercase.
    • Example: uppercase_text = "hello".upper() # uppercase_text will be "HELLO"
  3. string.lower(): Converts all uppercase letters in the string to lowercase.
    • Example: lowercase_text = "HELLO".lower() # lowercase_text will be "hello"
  4. string.split(sep=None, maxsplit=-1): Splits the string into a list of substrings based on the specified separator (sep); with no separator it splits on runs of whitespace. The optional maxsplit parameter limits the number of splits.
    • Example: words = "apple,banana,cherry".split(",") # words will be ["apple", "banana", "cherry"]
  5. string.join(iterable): Joins the strings from an iterable (e.g., list) into a single string, using the string the method is called on as the separator.
    • Example: joined_string = "-".join(["apple", "banana", "cherry"]) # joined_string will be "apple-banana-cherry"
  6. string.strip(chars=None): Removes leading and trailing whitespace from the string. Optionally, you can specify a set of characters to remove instead (chars).
    • Example: clean_text = " Extra spaces ".strip() # clean_text will be "Extra spaces"
  7. string.replace(old, new, count=-1): Replaces occurrences of a substring (old) with another substring (new). The optional count parameter limits the number of replacements.
    • Example: fixed_text = "Mississippi".replace("ss", "s", 1) # fixed_text will be "Misissippi"
  8. string.startswith(prefix, start=0, end=None): Checks if the string starts with the specified prefix within a given range.
    • Example: does_start = "Hello, World!".startswith("Hello") # does_start will be True
  9. string.endswith(suffix, start=0, end=None): Checks if the string ends with the specified suffix within a given range.
    • Example: does_end = "Hello, World!".endswith("World!") # does_end will be True
  10. string.find(sub, start=0, end=None): Returns the index of the first occurrence of the substring (sub) within a given range, or -1 if not found.
    • Example: first_index = "Hello, World!".find("W") # first_index will be 7
  11. string.rfind(sub, start=0, end=None): Returns the index of the last occurrence of the substring (sub) within a given range, or -1 if not found.
    • Example: last_index = "Hello, World! World!".rfind("World") # last_index will be 14
  12. string.isalpha(): Checks if the string is non-empty and all of its characters are alphabetic (letters).
    • Example: is_alpha = "hello123".isalpha() # is_alpha will be False
  13. string.isdigit(): Checks if the string is non-empty and all of its characters are digits.
    • Example: is_digit = "12345".isdigit() # is_digit will be True
  14. string.isalnum(): Checks if the string is non-empty and all of its characters are alphanumeric (letters and digits).
    • Example: is_alnum = "hello123".isalnum() # is_alnum will be True
  15. string.isspace(): Checks if the string is non-empty and all of its characters are whitespace characters (spaces, tabs, newlines, etc.).
    • Example: is_space = " \t\n".isspace() # is_space will be True
  16. string.istitle(): Checks if the string is a titlecased string (first letter of each word uppercase, others lowercase).
    • Example: `is_title
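
The behaviours listed above can be checked in one go with a short script. The sketch below uses plain assert statements; a few extra inputs (such as "xxhixx" and "a,b,c") are illustrative additions rather than examples from the list.

```python
text = "Hello, World!"

assert len(text) == 13
assert "hello".upper() == "HELLO" and "HELLO".lower() == "hello"
assert "apple,banana,cherry".split(",") == ["apple", "banana", "cherry"]
assert "a,b,c".split(",", 1) == ["a", "b,c"]        # maxsplit limits the splits
assert "-".join(["apple", "banana"]) == "apple-banana"
assert " Extra spaces ".strip() == "Extra spaces"
assert "xxhixx".strip("x") == "hi"                  # strip a chosen set of characters
assert "Mississippi".replace("ss", "s", 1) == "Misissippi"
assert text.startswith("Hello") and text.endswith("World!")
assert text.find("W") == 7 and text.find("z") == -1
assert "Hello, World! World!".rfind("World") == 14
assert not "hello123".isalpha() and "hello123".isalnum()
assert "12345".isdigit() and " \t\n".isspace()
assert "Hello World".istitle() and not "Hello world".istitle()
print("all string checks passed")
```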
