95.21 Web Site Analyser
20210707
Use webalizer as a command line tool to analyse the apache2 access logs.
After installing webalizer, create a folder to hold the results (saved as HTML) using mkdir, then cd (change directory) into it.
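As a minimal sketch of this step, the folder can be created and entered as follows. The name webstats is hypothetical, chosen only for illustration, and the commented install line assumes a Debian-style system with apt.

```shell
# Assumption: a Debian/Ubuntu system; adjust for your package manager.
# sudo apt install webalizer

# Hypothetical folder name; any writable location will do.
outdir=webstats

# -p avoids an error if the folder already exists.
mkdir -p "$outdir"
cd "$outdir"
```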
Next, collect together all of the apache logs you can access, using zcat for compressed files and cat for text files, redirecting the output to the file tmp.log.
zcat /var/log/apache2/access*.gz > tmp.log
cat /var/log/apache2/access.log.1 /var/log/apache2/access.log >> tmp.log
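Sort the lines in tmp.log by timestamp, saving the result to access.log, so that webalizer sees the records in chronological order. This command line comes from https://stackoverflow.com/questions/5672733/how-can-i-sort-an-apache-log-file-by-date. Note that with -u, sort compares only the listed keys, so requests that share a timestamp are collapsed into a single line; omit -u to keep every request.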
sort -u -t ' ' -k 4.9,4.12n -k 4.5,4.7M -k 4.2,4.3n -k 4.14,4.15n -k 4.17,4.18n -k 4.20,4.21n tmp.log > access.log
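To see what the sort keys do, here is a small sketch that applies the same command to three made-up log lines. The file names sample.log and sorted.log and the log entries themselves are hypothetical. The keys pick character ranges out of the fourth space-separated field, which looks like [02/Feb/2021:10:00:00, and compare year, month name, day, hour, minute and second in that order. LC_ALL=C is an added assumption so that the month abbreviations are matched as English names.

```shell
# Three hypothetical log lines, deliberately out of chronological order.
cat > sample.log <<'EOF'
192.0.2.1 - - [02/Feb/2021:10:00:00 +0000] "GET /b HTTP/1.1" 200 512
192.0.2.2 - - [15/Jan/2021:23:59:59 +0000] "GET /a HTTP/1.1" 200 1024
192.0.2.3 - - [31/Dec/2020:08:30:00 +0000] "GET /c HTTP/1.1" 404 256
EOF

# Field 4, characters 9-12 = year, 5-7 = month name (M), 2-3 = day,
# 14-15 = hour, 17-18 = minute, 20-21 = second.
LC_ALL=C sort -u -t ' ' -k 4.9,4.12n -k 4.5,4.7M -k 4.2,4.3n \
    -k 4.14,4.15n -k 4.17,4.18n -k 4.20,4.21n sample.log > sorted.log

# Show just the timestamp field of the sorted result.
cut -d ' ' -f 4 sorted.log
```

The timestamps should come out oldest first: 31/Dec/2020, then 15/Jan/2021, then 02/Feb/2021.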
Now run webalizer over the sorted access.log file, using -o to direct the output to the current working directory (.):
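A minimal sketch of such an invocation, assuming webalizer is on the PATH and access.log sits in the current directory. The variable names and the guard around the call are illustrative additions, not part of the original instructions.

```shell
logfile=access.log   # sorted log produced in the previous step
outdir=.             # current working directory, as described above

# Only run webalizer if it is installed and the log is non-empty.
if command -v webalizer >/dev/null 2>&1 && [ -s "$logfile" ]; then
    webalizer -o "$outdir" "$logfile"
else
    echo "webalizer is not installed, or $logfile is missing or empty" >&2
fi
```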
Finally, visit that directory in a browser:
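One way to do this from the command line is sketched below. It assumes webalizer has written its summary page as index.html in the current directory, and that xdg-open (from xdg-utils) is available to launch the default browser. If either assumption fails, it prints a file URL to paste into a browser instead.

```shell
# Assumed location of the summary page written by webalizer.
report="$PWD/index.html"

if [ -f "$report" ] && command -v xdg-open >/dev/null 2>&1; then
    # Open the report in the desktop's default browser.
    xdg-open "$report"
else
    # Fall back to printing a URL that can be pasted into any browser.
    echo "file://$report"
fi
```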
Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0