I am a teaching assistant for CS 166, Principles of Cybersecurity, for the fall semester of 2019, under Prof J. Eddy.
My office hours for CS 166 are
- Monday 1:00 PM to 2:00 PM
- Tuesday 9:00 AM to 10:00 AM
or by appointment. My office is E332 Innovation Hall, but office hours will be in E338, the common area immediately outside E332.
Please feel free to email me with questions at clayton dot cafiero at (you know the rest)!
Topics - This course builds a strong foundation in the principles of cybersecurity. Topics include an introduction to cybersecurity, fundamental security design principles, programming flaws, malicious code, and web and database security, as well as common cryptography algorithms and hashing functions. The course concludes with an overview of computer networks and common network threat vectors.
How to read a specification
Here’s what Prof Eddy wrote.
1) The program should display a welcome message and prompt the user for a username. Create a simulated buffer overflow condition by allowing a user to input more data than the size of the allocated memory (causing the program to crash).
2) Implement input validation to mitigate the simulated overflow vulnerability. Check that the username entered has a minimum length of 8 characters and a maximum of 25 characters. If the user enters a username length outside those limits, return an error message and prompt the user to re-enter the username.
Also, recall that Prof Eddy supplied a video accompanying the spec, which gave a clear example of one way to simulate the error.
- Your program should display a welcome message and prompt for a username. Simple.
- There’s no mention of password or any other input. Just username. No need to prompt for multiple usernames, passwords, mother’s maiden name, last four digits of your SSN, color of the sky, or anything else.
- If input exceeds size of allocated memory, simulate a buffer overflow error! (I know, this is a little kludgey, but it is what it is, and we should all understand it is to illustrate a point.) There are many ways to do this; the video shows you a very simple one. Some of you got creative (in a positive way) and that’s OK; others didn’t simulate an error at all (insert sad-face emoji here).
- The specification isn’t as explicit as it might be in this case, but for best comparison between vulnerable and fixed code, generally it’s best to keep as many things constant as possible. So since you were asked to permit usernames of length 8-25 characters, this should have (implicitly) been a constraint for program #1, with usernames with fewer than eight characters rejected normally, and usernames with more than 25 characters generating an error.
- Ideally, the preallocated structure that should have been used in program #1 should have been left in place, and input validation should have ensured that nothing bigger than that structure would be accepted as input. (I did not deduct points if that preallocated structure was removed in program #2.)
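To make the points above concrete, here is a minimal sketch of one way the two programs could fit together. It is not the approach from the video or anyone's graded solution; the names BUFFER_SIZE, SimulatedBufferOverflow, store_username, is_valid_username, and prompt_for_username are all made up for illustration. Python lists do not really overflow, so the "crash" is simulated by raising an exception when the input is longer than the preallocated structure.

```python
BUFFER_SIZE = 25   # simulated allocation: room for 25 characters
MIN_LENGTH = 8     # shortest username the spec allows


class SimulatedBufferOverflow(Exception):
    """Raised when input does not fit in the preallocated buffer."""


def store_username(username):
    """Copy username into a fixed-size buffer, 'crashing' if it does not fit."""
    buffer = [None] * BUFFER_SIZE
    if len(username) > BUFFER_SIZE:
        raise SimulatedBufferOverflow(
            f"wrote {len(username)} characters into a {BUFFER_SIZE}-character buffer"
        )
    buffer[:len(username)] = username
    return buffer


def is_valid_username(username):
    """Program #2's check: accept only lengths from MIN_LENGTH to BUFFER_SIZE."""
    return MIN_LENGTH <= len(username) <= BUFFER_SIZE


def prompt_for_username(read=input):
    """Keep prompting until the username passes validation, then store it."""
    print("Welcome!")
    while True:
        username = read("Username: ")
        if is_valid_username(username):
            store_username(username)  # cannot overflow: length already checked
            return username
        print(f"Username must be {MIN_LENGTH}-{BUFFER_SIZE} characters. Try again.")
```

Program #1 would call store_username directly on whatever the user typed. Program #2 keeps the same buffer and adds the validation loop in front of it, so the only thing that changes between the two is the fix itself.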
In summary, please read assignment specifications carefully, and think how best to demonstrate that you did indeed fix something that was broken! While most folks did OK, some folks didn’t satisfy basic requirements, and some folks made extra work for themselves.
As we gain more experience, I will be a little more strict about how closely graded assignments must follow the specification.
I work (now part-time) for a company that has designed and built an equity research publishing and distribution platform used by numerous equity research firms (usually smaller “boutique” shops of between 40 and 200 analysts). This platform is web-based, and we provide customized environments for each client. Our clients sell their research, commonly on a subscription model, to hedge funds, equity funds, and institutional investors. Security is of paramount importance. Bad actors might seek to access research for which they have not paid, and then trade on this information.
On Tuesday, 17 September, one of our clients was the target of a few dozen attempted SQL injection attacks. The attacks failed because the application is hardened against such attacks, but our monitoring system detected the attempts and created alerts. Here’s a screenshot from our monitoring system:
You can see in the screenshot that the perpetrator tried something very similar to what’s explained in this course:
'"2 AND "x"="x'
The idea behind the attack is that the string '"2 AND "x"="x' is appended to the SQL query, where a search term might be. The attacker is trying to take a query like
SELECT * FROM table_foo WHERE record_id = 2
and get it to execute something like this:
SELECT * FROM table_foo WHERE record_id = 2 AND "x"="x"
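Since the last clause always evaluates to true, this particular query returns the same rows as the original, which tells the attacker that the input is being executed as SQL. Had the attacker used OR instead of AND, such a query might return all records.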
We harden our web applications in many ways so this kind of attack won’t ever work.
In lab you will harden a demo application by “sanitizing” input from the user — making sure that everything you use to build a SQL query is safe.
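Here is a small, self-contained sketch of the difference between building a query by string concatenation and passing user input as a parameter. It uses Python's built-in sqlite3 module with an in-memory database. Note that this shows parameterized queries, which is a related but different technique from the input sanitizing you will do in lab, and it is not how our production platform is built. The table contents and the function names find_unsafe and find_safe are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_foo (record_id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO table_foo VALUES (?, ?)",
    [(1, "alpha"), (2, "beta"), (3, "gamma")],
)


def find_unsafe(user_input):
    # Vulnerable: the user's input becomes part of the SQL text itself.
    query = "SELECT * FROM table_foo WHERE record_id = " + user_input
    return conn.execute(query).fetchall()


def find_safe(user_input):
    # Safer: the input is bound as a value and is never parsed as SQL.
    query = "SELECT * FROM table_foo WHERE record_id = ?"
    return conn.execute(query, (user_input,)).fetchall()


injected = "2 OR 'x'='x'"
print(find_unsafe(injected))  # every row in the table
print(find_safe(injected))    # no rows: the whole string is compared as a value
```

With the concatenated query the database sees an extra OR clause that is always true. With the bound parameter it looks for a record whose id is literally the attacker's string, and finds nothing.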
There were other attempts as well, all using the same basic idea. See if you can understand what’s going on in each. Imagine these strings appended to a query template ending in “WHERE some_column =”.
'2 AND 1=1'
'299999" union select unhex(hex(version())) -- "x"="x'
"2' or (1,2)=(select*from(select name_const(CHAR(111,108,111,108,111,115,104,101,114),1), name_const(CHAR(111,108,111,108,111,115,104,101,114),1))a) -- 'x'='x"
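A hint for the last string: MySQL's CHAR() function builds a string from a list of character codes, which lets an attacker smuggle in a string without typing any quote characters. If you want to see what the codes spell, a couple of lines of Python will do it (the variable names here are my own):

```python
# Character codes copied from the CHAR(...) call in the attack string.
codes = [111, 108, 111, 108, 111, 115, 104, 101, 114]

marker = "".join(chr(code) for code in codes)
print(marker)  # ololosher
```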
Anyhow, it was a simple matter to track the attacks to their origin. This turned out to be from a managed hosting service in Singapore. The offending IP address was reported and blocked.
Please feel free to come by during office hours if you find this interesting, or visit the OWASP website’s page on SQL injection.
And here’s a little something on-topic from Randall Munroe’s XKCD:
XKCD content licensed under a Creative Commons Attribution-NonCommercial 2.5 License
Most of you did a pretty good job with the functional requirements of the application. There were more than a few points taken off for not following the assignment instructions. I implore you: please read the instructions carefully and if you have questions, see me, Prof Eddy, or one of the other TAs. This is a silly way to lose points.
Some other general feedback:
While many of you used the “with open” idiom to open your data file, many did not. This idiom is pretty standard and I encourage its use.
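In case the idiom is new to you, here is a minimal sketch. The file name users.txt and its contents are made up, and the file is written to a temporary directory so the example can run anywhere.

```python
import os
import tempfile

# Hypothetical data file, created in a temporary directory for the demo.
path = os.path.join(tempfile.mkdtemp(), "users.txt")

with open(path, "w") as outfile:
    outfile.write("alice\nbob\n")

# The file is closed automatically when the block ends,
# even if an exception is raised inside it.
with open(path) as infile:
    names = [line.strip() for line in infile]

print(names)          # ['alice', 'bob']
print(infile.closed)  # True
```

The point is that you never have to remember to call close(), and you cannot accidentally skip it when something goes wrong partway through.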
Some of you elected not to use the csv module, which is part of the Python standard library. That’s OK if that’s your decision, but I encourage you to use standard library modules where appropriate. It’s good practice, it helps you write cleaner code, and you don’t have to reinvent the wheel. (N.B. I refer only to standard library modules and not third-party modules.) The standard library is substantial; even after many years of programming in Python I couldn’t claim to be familiar with every module. The point is: there’s a lot out there at your disposal. Use it!
Apropos of the above, if you find yourself writing a parser then you’ve missed an opportunity (unless, of course, your task is to write a parser).
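Here is a short sketch of why the csv module beats a hand-rolled parser. The column names and data are invented for the example, and an io.StringIO object stands in for a real file.

```python
import csv
import io

# One field contains a comma inside quotes, which is legal CSV.
data = io.StringIO('username,note\nalice,"likes commas, apparently"\n')

rows = list(csv.DictReader(data))
print(rows[0]["username"])  # alice
print(rows[0]["note"])      # likes commas, apparently

# The do-it-yourself approach splits the quoted field in two.
naive = 'alice,"likes commas, apparently"'.split(",")
print(len(naive))           # 3
```

Quoting is only one of the cases the module handles for you. Every one of them is a bug waiting to happen in a homemade parser.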
If you find yourself repeating blocks of code, or blocks of very similar code, chances are near 100% that you’ve made an error in design. Take a step back and assess. See: Code Smells on Wikipedia and notice that duplicated code is the first item on the list for application-level smells!
Generally, I frown on mixing file I/O and logic, though I understand your reasons for doing so in this assignment and did not deduct points. Even in an environment where you’re using a database and not the file system for storing your data, it’s good practice to avoid having to go back to the well any more than is necessary.
Give thought to appropriate data structures for your application. This can save you much heartache and frustration.
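The last two points go together nicely. Here is one possible sketch, not a model solution: read the data once into a dictionary keyed by username, then keep the logic in functions that never touch the file. The column names and the functions load_users and access_level are hypothetical.

```python
import csv
import io


def load_users(handle):
    """All file I/O happens here, once: return a dict keyed by username."""
    return {row["username"]: row for row in csv.DictReader(handle)}


def access_level(users, username):
    """Pure logic: no file access, so it is easy to test."""
    user = users.get(username)
    return user["level"] if user else None


# A StringIO object stands in for the real data file.
handle = io.StringIO("username,level\nalice,admin\nbob,guest\n")
users = load_users(handle)

print(access_level(users, "alice"))  # admin
print(access_level(users, "carol"))  # None
```

A dictionary makes each lookup a single step instead of a scan through a list, and because access_level only sees the dictionary, you can test it without any file at all.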
Always provide instructions for testing your code in a separate file.
Finally, since this is a course on cybersecurity, be watchful. Some of you wrote code that introduced vulnerabilities in your application. Always be on the lookout.
Coding Style - Rule #1: Don’t invent your own coding style.
With Python, there are common guidelines and best practices for just about everything. I encourage you to read PEP 8 - Style Guide for Python Code.
You might also consider checking your code using pep8, both of which are installable via pip.
Most likely, I won’t take off points for not following PEP 8 unless it interferes with readability. Nevertheless, please consider this suggestion.
By the way, you’ll notice that right near the start of PEP 8 is a section entitled “A Foolish Consistency is the Hobgoblin of Little Minds” (perhaps not ironically, this quotation is not properly attributed to its author, Ralph Waldo Emerson, who wrote it in his famous essay “Self-Reliance”). This section provides good advice to the experienced Pythonista, but while learning it’s best to stick to the standards to the greatest reasonable extent. (I’ve been coding in Python for almost 20 years now, and I stick to these standards, deviating only in specific cases with solid justification and not on a whim.)