Python, Read a text file and print out a list of invalid user names

+1 vote
asked Nov 7, 2019 by John Covey (130 points)
I need to read the Linux /var/log/auth.log file and print the user names from unsuccessful login attempts. Some lines of the file look like this:
```
May 26 07:29:20 instance-1 sshd[20327]: Disconnected from 61.147.247.146 port 45177 [preauth]
May 26 07:32:22 instance-1 sshd[20351]: Invalid user nagios from 159.65.144.233 port 49715
May 26 07:32:22 instance-1 sshd[20351]: input_userauth_request: invalid user nagios [preauth]
May 26 07:32:23 instance-1 sshd[20351]: Received disconnect from 159.65.144.233 port 49715:11: Normal
May 26 07:32:22 instance-1 sshd[20351]: Invalid user admin from 159.65.144.233 port 49715
```
I need to read the lines of this file and write a sorted list of the invalid user names to a file, one per line, like the list shown below.

```
nagios
admin
```
I have got this far, but I need it to print only the word that comes AFTER the search string.

        errors = []                   # The list where we will store results.
        linenum = 0
        substr = "Invalid user"       # Substring to search for.
        with open('auth.log', 'rt') as myfile:
            for line in myfile:
                linenum += 1
                # Keep every line that contains the substring.
                if line.find(substr) != -1:
                    errors.append(line.rstrip('\n'))
        for line in errors:
            print(line)
    

Thanks

2 Answers

0 votes
answered Nov 8, 2019 by gameforcer (2,950 points)

You can use the str.index() method to find the bounds of the substring, like so:

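for i in errors:
    # Slice from just after "user " up to "from", then trim the trailing space.
    print(i[i.index('user')+5:i.index('from')].strip())

The "+5" skips past "user" and the space that follows it. The slice ends just before "from", and strip() removes the space left in front of it.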

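As a self-contained illustration of the slicing idea above, here is a minimal sketch. The helper name `extract_user` and the `sample` list are assumptions made for this example, not part of the original answer. It uses `str.find` instead of `str.index` so that a line without the marker is skipped rather than raising `ValueError`, and it searches for the full phrase "Invalid user " instead of "user" alone.

```python
def extract_user(line, marker="Invalid user "):
    """Return the user name that follows `marker`, or None if absent.

    Hypothetical helper for illustration; assumes the sshd log format
    shown in the question ("Invalid user NAME from ADDRESS port N").
    """
    start = line.find(marker)
    if start == -1:
        return None
    rest = line[start + len(marker):]
    # The name ends where " from " begins; otherwise keep the remainder.
    end = rest.find(" from ")
    name = rest[:end] if end != -1 else rest
    return name.strip() or None


# Sample lines copied from the question.
sample = [
    "May 26 07:32:22 instance-1 sshd[20351]: Invalid user nagios from 159.65.144.233 port 49715",
    "May 26 07:32:23 instance-1 sshd[20351]: Received disconnect from 159.65.144.233 port 49715:11: Normal",
    "May 26 07:32:22 instance-1 sshd[20351]: Invalid user admin from 159.65.144.233 port 49715",
]

for line in sample:
    user = extract_user(line)
    if user is not None:
        print(user)
```

On the sample lines this prints nagios and admin, each on its own line.
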
0 votes
answered Feb 5 by Nando Abreu (930 points)

import re
import sys

destinationFile = "/tmp/logins.csv"
sourceFile = "/var/log/auth.log"
criteria = "Invalid user"

matches = []
with open(sourceFile) as src:
    for line in src.read().splitlines():
        if criteria not in line:
            continue
        # Replace the whole line with the word captured after the criteria.
        login = re.sub(".*" + criteria + r" (\w+).*", r"\1", line)
        matches.append(login)

# Remove duplicates, then sort.
matches = sorted(set(matches))
count = len(matches)

if count == 0:
    print("No lines having '{}' in {}. Abort.".format(criteria, sourceFile))
    sys.exit(1)
else:
    with open(destinationFile, "w") as dst:
        dst.write("\n".join(matches))
    print("{} login(s) written to {}.".format(count, destinationFile))

#M# file open/read/parse/write, regex/regular expression, regex group capture/replacement, list sort, list clean/remove duplicates, print list in file, system error
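
A variation on the same approach, offered as a sketch and not as the answer's own method: it captures the name with `re.search` and a `\S+` group, so names containing characters such as `-` or `.` are kept whole (the `\w+` group above would stop at them). The names `INVALID_USER`, `collect_invalid_users`, `write_users`, and `logins.txt` are hypothetical, and the pattern assumes the "Invalid user NAME from ADDRESS" format shown in the question. The demonstration runs on the question's sample lines and a temporary directory, not on the real log.

```python
import re
import tempfile
from pathlib import Path

# Assumed pattern: the name is the non-space text between
# "Invalid user " and " from " in the sshd message.
INVALID_USER = re.compile(r"Invalid user (\S+) from ")


def collect_invalid_users(lines):
    """Return the sorted, de-duplicated user names found in `lines`."""
    users = set()
    for line in lines:
        match = INVALID_USER.search(line)
        if match:
            users.add(match.group(1))
    return sorted(users)


def write_users(users, destination):
    """Write one user name per line to `destination`."""
    Path(destination).write_text("".join(name + "\n" for name in users))


# Demonstration on the sample lines from the question (not the real log).
sample_log = """\
May 26 07:29:20 instance-1 sshd[20327]: Disconnected from 61.147.247.146 port 45177 [preauth]
May 26 07:32:22 instance-1 sshd[20351]: Invalid user nagios from 159.65.144.233 port 49715
May 26 07:32:22 instance-1 sshd[20351]: input_userauth_request: invalid user nagios [preauth]
May 26 07:32:22 instance-1 sshd[20351]: Invalid user admin from 159.65.144.233 port 49715
"""

users = collect_invalid_users(sample_log.splitlines())
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "logins.txt"
    write_users(users, out_path)
    written = out_path.read_text()

print(users)  # ['admin', 'nagios']
```

To run it against the real log, an open file object can be passed directly, for example `with open("/var/log/auth.log") as src: users = collect_invalid_users(src)`. Reading that file usually requires elevated privileges.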
