Introduction

Textual data in Python is handled with str objects, or strings. Strings are immutable sequences of Unicode code points. This post discusses several built-in string methods for working with strings.

Checking Start and End

  • str.startswith(prefix[, start[, end]]) : Return True if the string starts with prefix, otherwise return False. With optional start, the comparison begins at that position. With optional end, the comparison stops at that position.
  • str.endswith(suffix[, start[, end]]) : Return True if the string ends with the specified suffix, otherwise return False. With optional start, the comparison begins at that position. With optional end, the comparison stops at that position.
  • str.strip([chars]) : Return a copy of the string with the leading and trailing characters removed. The chars argument is a string specifying the set of characters to be removed; it is not a prefix or suffix, and all combinations of its characters are stripped. If omitted, the chars argument defaults to removing whitespace.
  • str.rstrip([chars]) : Return a copy of the string with trailing characters removed. The chars argument is a string specifying the set of characters to be removed; it is not a suffix, and all combinations of its characters are stripped. If omitted, the chars argument defaults to removing whitespace.
str1 = "The quick brown fox jumps over the lazy dog"

# startswith and endswith methods
print(str1.startswith("The"))          # True
print(str1.startswith("quick", 4))     # True
print(str1.endswith("dog"))            # True

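# strip removes the characters d, o, g, T, h, e from both ends and stops at
# the first character not in that set, so the surrounding spaces remain.
# Output: ' quick brown fox jumps over the lazy '
print(repr(str1.strip('dogThe')))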
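
The optional start and end arguments, and the set-of-characters behaviour of chars, are easy to misread from the descriptions above. The following is a minimal sketch of both; the sample values padded and "test.txt" are made up for illustration.

```python
str1 = "The quick brown fox jumps over the lazy dog"

# start and end bound the region that is compared, as in slicing
in_window = str1.startswith("quick", 4, 9)    # str1[4:9] is "quick"
cut_short = str1.startswith("quick", 4, 8)    # str1[4:8] is "quic"
ends_lazy = str1.endswith("lazy", 0, 39)      # str1[0:39] ends with "lazy"
print(in_window, cut_short, ends_lazy)        # True False True

# With no argument, strip() and rstrip() remove whitespace
padded = "  hello \n"   # hypothetical sample value
stripped = padded.strip()
right_stripped = padded.rstrip()
print(repr(stripped))          # 'hello'
print(repr(right_stripped))    # '  hello'

# chars is a set of characters, so this removes more than ".txt"
over_stripped = "test.txt".rstrip(".txt")
print(over_stripped)           # tes
```

The last call shows why rstrip is a poor fit for removing a fixed suffix: every trailing character found in chars is removed, including the final "t" of "test".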

Finding Substring

  • str.find(sub[, start[, end]]) : Return the lowest index in the string where substring sub is found within the slice s[start:end]. start and end are optional arguments interpreted as in slice notation. Return -1 if sub is not found.
  • str.rfind(sub[, start[, end]]) : Return the highest index in the string where substring sub is found, such that sub is contained within s[start:end]. start and end are optional arguments interpreted as in slice notation. Return -1 if sub is not found.
  • str.count(sub[, start[, end]]) : Return the number of non-overlapping occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.
str1 = "The quick brown fox jumps over the lazy dog"

# Finding Substring
print(str1.find("fox"))    # 16
print(str1.rfind("dog"))   # 40
print(str1.count("the"))   # 1
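
The start argument and the -1 return value are often used together to walk through a string. Here is a small sketch, reusing str1 from the example above; the variable names are chosen only for illustration.

```python
str1 = "The quick brown fox jumps over the lazy dog"

# Resume the search just past the previous match
first_o = str1.find("o")                 # the "o" in "brown"
second_o = str1.find("o", first_o + 1)   # the "o" in "fox"
last_o = str1.rfind("o")                 # the "o" in "dog"
print(first_o, second_o, last_o)         # 12 17 41

# find returns -1 instead of raising when there is no match
missing = str1.find("cat")
print(missing)                           # -1

# count only counts non-overlapping occurrences
overlap = "aaaa".count("aa")
print(overlap)                           # 2

# start restricts counting to str1[20:]
o_after_20 = str1.count("o", 20)
print(o_after_20)                        # 2
```

Because -1 is also a valid index in Python, it is safer to compare the result against -1 explicitly before using it to slice.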

Converting Case

  • str.capitalize() : Return a copy of the string with its first character capitalized and the rest lowercased.
  • str.lower() : Return a copy of the string with all the cased characters converted to lowercase.
  • str.upper() : Return a copy of the string with all the cased characters converted to uppercase.
str1 = "The quick brown fox jumps over the lazy dog"

# Converting Case
print("quick brown fox".capitalize()) # Quick brown fox
print("Quick Brown Fox".lower())      # quick brown fox
print("quick brown fox".upper())      # QUICK BROWN FOX
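
Since strings are immutable, each of these methods returns a new string and leaves the original untouched. A short sketch of that behaviour, together with a case-insensitive count, might look like this; the value of title is made up for illustration.

```python
str1 = "The quick brown fox jumps over the lazy dog"
title = "pYTHON string METHODS"   # hypothetical sample value

# capitalize uppercases the first character and lowercases the rest
capitalized = title.capitalize()
print(capitalized)    # Python string methods

# The original string is not modified
print(title)          # pYTHON string METHODS

# Lowercasing first makes the count ignore case ("The" and "the")
the_count = str1.lower().count("the")
print(the_count)      # 2
```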

Manipulating String

  • str.join(iterable) : Return a string which is the concatenation of the strings in iterable. The separator between elements is the string providing this method.
  • str.split(sep=None, maxsplit=-1) : Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (maxsplit+1 elements).
  • str.rsplit(sep=None, maxsplit=-1) : Except for splitting from the right, rsplit() behaves like split() which is described above.
  • str.replace(old, new[, count]) : Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
  • str.splitlines([keepends]) : Return a list of the lines in the string, breaking at line boundaries. Line breaks are not included in the resulting list unless keepends is given and true.
# Example showing use of standard library functions

str1 = "The quick brown fox jumps over the lazy dog"

# Manipulating String
letters = ["a", "e", "i", "o", "u"]
print("-".join(letters))    # Output: a-e-i-o-u

# A set has no defined order, so the order of the joined letters can vary
letters = {"A", "E", "I", "O", "U"}
print("*".join(letters))    # Possible output: O*A*U*E*I

# Using split
# Output: ['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
print(str1.split(" "))

# using replace
# Output: The lazy brown fox jumps over the lazy dog
print(str1.replace("quick", "lazy"))

# Using splitlines
lines = 'ae i\n\nou\n'

# Output: ['ae i', '', 'ou']
print(lines.splitlines())

# Output: ['ae i\n', '\n', 'ou\n']
print(lines.splitlines(keepends=True))
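
The examples above do not exercise maxsplit, rsplit, or the count argument of replace. The sketch below illustrates them; the strings record and filename are hypothetical sample data, not part of the original examples.

```python
# maxsplit=2 keeps the rest of the line together as the third element
record = "2024-01-15 ERROR disk full on /dev/sda1"   # hypothetical log line
date, level, message = record.split(" ", 2)
print(message)                  # disk full on /dev/sda1

# rsplit with maxsplit=1 splits only at the last separator
filename = "archive.tar.gz"     # hypothetical file name
stem, ext = filename.rsplit(".", 1)
print(stem, ext)                # archive.tar gz

# count limits how many occurrences are replaced, starting from the left
replaced = "a-b-c-d".replace("-", "+", 2)
print(replaced)                 # a+b+c-d

# With sep omitted, runs of whitespace count as a single separator
explicit = "a  b".split(" ")
default = "a  b".split()
print(explicit)                 # ['a', '', 'b']
print(default)                  # ['a', 'b']
```

Omitting sep is usually the better choice for splitting free-form text into words, since it also discards leading and trailing whitespace.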

Formatting String

  • str.ljust(width[, fillchar]) : Return the string left justified in a string of length width. Padding is done using the specified fillchar (default is a space). The original string is returned if width is less than or equal to len(s).
  • str.rjust(width[, fillchar]) : Return the string right justified in a string of length width. Padding is done using the specified fillchar (default is a space). The original string is returned if width is less than or equal to len(s).
  • str.center(width[, fillchar]) : Return the string centered in a string of length width. Padding is done using the specified fillchar (default is a space). The original string is returned if width is less than or equal to len(s).
names = ["Python", "Java"]
alignSize = 8

'''Output:
Python--:
Java----:
'''
for name in names:
    print(name.ljust(alignSize, "-") + ":")

'''Output:
-Python-:
--Java--:
'''
for name in names:
    print(name.center(alignSize, "-") + ":")

'''Output:
--Python:
----Java:
'''
for name in names:
    print(name.rjust(alignSize, "-") + ":")
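
When fillchar is omitted, padding uses spaces, which makes ljust and rjust handy for lining up simple text columns. The following is a minimal sketch with made-up data in rows and arbitrarily chosen column widths.

```python
rows = [("apple", 3), ("kiwi", 12)]   # hypothetical sample data

# Left-justify names in 8 columns, right-justify quantities in 4 columns
table = [name.ljust(8) + str(qty).rjust(4) for name, qty in rows]
for line in table:
    print(line)
# apple      3
# kiwi      12

# If width is not larger than the string, the string comes back unchanged
unchanged = "Python".center(4)
print(unchanged)    # Python
```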

Reference

Built-in Types