
Python Program That Counts How Many Times A Character Appears In A File

Lazy dictionary and my thought process from a while back

By James Ssekamatte. Published 2 years ago. 4 min read.

The role of this code was to take a given text file containing full sentences with all forms of characters and iterate through the file to find the number of times any given character appears in the file while ignoring those characters that we aren’t interested in.

My requirement was also to make the code case insensitive. I ended up coding my way around that myself, although I am sure there is a simple function that can take care of it.

Anyway…

It is a Python file, and I used vim, as it is my favorite editor even though it's a pain sometimes. I created the file and decided to call it “refile”…

vim refile.py

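I then imported the csv library, thinking it would let me perform the file manipulations I needed. Looking back, the built-in open() function does all the file work in this program, so this import is not strictly required.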

import csv

The main function that I am supposed to call at the end of the program for me to have output is…. yeah, main. So I had to define that.

def main():

I then decided to set up some form of a lazy dictionary. This would contain the key-value pairs in which the keys would be the letters I am keeping track of throughout the file while the values were the number of times those characters are showing up.

I called that dictionary “letter_frequency”

    letter_frequency = {
        "N": 0,
        "I": 0,
        "V": 0,
        "C": 0,
        "K": 0,
        "T": 0,
        "A": 0
    }

The letters above were just random samples; putting a non-alphabetical character there should work just as well. I did, however, make them all capital as part of the approach I was using to make the program “case insensitive”.
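As a minimal sketch of the same idea, the starting dictionary could also be built from a plain string of characters rather than typed out by hand. The helper name make_tally is hypothetical and not part of the original program.

```python
# Hypothetical helper (not in the original program): build the starting
# tally from a string of the characters we want to track.
def make_tally(characters):
    """Return a dict mapping each character, uppercased, to a count of 0."""
    return dict.fromkeys(characters.upper(), 0)


# Same seven letters as the hand-written dictionary, given in lowercase
# to show that they end up as capitals.
letter_frequency = make_tally("nivckta")
print(letter_frequency)
```

Because the keys are uppercased here, the dictionary matches text that has also been uppercased, which is the trick the rest of the program relies on.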

I now open the file in read-and-write mode (r+). I did this because I was going to change all the characters to uppercase first, before beginning my tallying process. This would get rid of the case issue and, in a way, make the count case insensitive.

To do this, I use the “with” statement, which opens the file and closes it for me without my having to write another line of code.

    with open(r"E:\documents\points_copy1.txt", "r+") as file:  # raw string keeps the backslashes as-is
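To see what “with” does about closing, here is a small sketch. It uses a throwaway temporary file instead of the real path, which is an assumption made only so the example can run anywhere.

```python
import os
import tempfile

# Throwaway file so the example does not depend on any real path.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

with open(path, "r+") as file:
    file.write("hello\n")
    inside = file.closed   # False: the file is still open inside the block

after = file.closed        # True: leaving the block closed the file for us

print(inside, after)
os.remove(path)
```

No explicit file.close() is written anywhere, yet the file is closed as soon as the indented block ends.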

The next two lines of code have a purpose too.

The first line reads all the lines in the file and assigns them, as a list, to a variable called “file_content”. Reading leaves the file position at the end of the file, so the next line, file.seek(0), moves it back to the very start. That way, the writing that follows overwrites the file from the beginning instead of adding to the end.

        file_content = file.readlines()
        file.seek(0)
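The effect of seek(0) is easier to see in isolation. The sketch below assumes a small temporary file with two made-up lines; only readlines(), seek() and tell() come from Python itself.

```python
import os
import tempfile

# Throwaway two-line file standing in for the real text file.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w") as f:
    f.write("first line\nsecond line\n")

with open(path, "r+") as file:
    file_content = file.readlines()    # reads everything; position is now at the end
    second_read = file.readlines()     # nothing left to read, so this is an empty list
    file.seek(0)                       # jump back to the start of the file
    position_after_seek = file.tell()  # 0
    third_read = file.readlines()      # the full contents again

print(second_read, position_after_seek)
os.remove(path)
```

Without the seek(0), anything written next would land after the existing text rather than on top of it.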

After that, I loop through the lines that I stored in “file_content”, make each one uppercase, and store it in a variable called upper_file. I then write “upper_file” back over the contents of my file, an action that converts all the letters in the file to uppercase.

        for line in file_content:
            upper_file = line.upper()
            file.write(upper_file)

Now the file should contain all uppercase letters.
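If changing the file on disk is not desirable, the same case handling can be done in memory. This is a sketch under that assumption, not the approach the program takes; read_uppercased and the sample file are hypothetical.

```python
import os
import tempfile


def read_uppercased(path):
    """Return the file's lines uppercased; the file itself is not modified."""
    with open(path, "r") as file:
        return [line.upper() for line in file]


# Throwaway file standing in for the real text file.
fd, sample_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(sample_path, "w") as f:
    f.write("Hello World\nvim is fun\n")

upper_lines = read_uppercased(sample_path)

with open(sample_path, "r") as f:
    on_disk = f.read()   # still in its original mixed case

os.remove(sample_path)
print(upper_lines)
```

This also sidesteps an edge case of overwriting in place: a few characters get longer when uppercased (for example, "ß".upper() is "SS"), so the rewritten text is not always the same length as the original.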

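I now open it again with the same command as earlier. You may ask why I need to re-open it: the code was just acting strange without this, so I decided to re-open it. The likely reason is that after the writes the file position sits at the end of the file, so reading straight away returns nothing; re-opening the file (or calling file.seek(0)) starts again from the beginning.

    with open(r"E:\documents\points_copy1.txt", "r+") as file:  # raw string keeps the backslashes as-is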

Here is where we go through the file one character at a time and check the lazy dictionary to see if that character is there. If it is part of the dictionary, we add one to the count of that character. If we are on a character that is not in the dictionary, for instance a number or a space, we just ignore it.

Before all that fun, however, we first want to read the file. The first line of this code block is probably not needed, since the reader it creates is never used afterwards. We have already covered what the next two lines of code do.

        reader = csv.reader(file)  # creates a CSV reader, but it is never used below
        line = file.readlines()
        file.seek(0)

In this next part, we are going to nest for-loops so that we can run loops within loops. This is because we first loop over the lines of the file and, while we are in any given line, we loop through the characters of that line and run an if-condition inside that last loop.

Backing up for a second, let us first create a for-loop to loop through the lines. How will it know when it has reached the last line? We use the “len()” function for that, which gives the number of lines in the list. While looping through the lines, I store the current line in a variable called line_char.

        for i in range(len(line)):
            line_char = line[i]

I will then start a for-loop at the same indent as the line_char variable. This loop will go through each character within the line. While it is doing that, it will assign that character to a temporary variable called “letter”.

            for c in line_char:
                letter = c[0]  # c is already a single character, so c[0] is the same as c

At the same indent as the letter variable, we then run an if-condition that checks whether the character in the letter variable exists in our lazy dictionary (letter_frequency). If it does, we add one to the value of that character; otherwise we just ignore it and move on to the next character.

                if letter in letter_frequency:
                    letter_frequency[letter] += 1
                else:
                    continue
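The nested loops and the if-condition above can be sketched as one small function. The name count_letters and the sample lines are made up for illustration; looping over the lines directly is used here in place of range(len(...)), and it visits the same lines in the same order.

```python
def count_letters(lines, letter_frequency):
    """Add one to letter_frequency for every tracked character found in lines."""
    for line_char in lines:                 # outer loop: one line at a time
        for letter in line_char:            # inner loop: one character at a time
            if letter in letter_frequency:  # only count characters we track
                letter_frequency[letter] += 1
            # anything else (digits, spaces, newlines) is simply skipped
    return letter_frequency


# Made-up, already-uppercased lines standing in for the file contents.
sample_lines = ["NIKITA CAN\n", "TV 123\n"]
tally = count_letters(
    sample_lines,
    {"N": 0, "I": 0, "V": 0, "C": 0, "K": 0, "T": 0, "A": 0},
)
print(tally)
```

The else: continue branch is left out because a for-loop moves on to the next character by itself when the if-condition is false.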

Last but not least, we shall come out of the nest and make another for loop that will go through our lazy dictionary and print the final results.

    for letter in letter_frequency:
        print(f"{letter}: {letter_frequency[letter]}")

We shall end the program by calling this main function so that the program runs when we type the “python refile.py” command.

main()
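For comparison, here is a shorter sketch of the same tally using collections.Counter from the standard library. This is a different technique from the lazy dictionary above, not the method this program uses, and count_selected and the sample sentence are hypothetical.

```python
from collections import Counter


def count_selected(text, targets):
    """Count, ignoring case, how often each character in targets appears in text."""
    counts = Counter(text.upper())             # tallies every character in one pass
    return {ch: counts[ch] for ch in targets}  # keep only the ones we care about


result = count_selected("Nikita can kick a tin can", "NIVCKTA")
for letter, count in result.items():
    print(f"{letter}: {count}")
```

Counter returns 0 for characters it never saw, so a tracked letter that does not appear in the text still shows up in the result with a count of zero.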


About the Creator

James Ssekamatte

Engineer and artist sharing my perspective with the world.
