image by Andrew Ridley
Use beginning and end of string in regular expressions
We often validate user input using regular expressions.
There are lots of regular expressions on the Internet. Every now and then we might ‘borrow’ one to save ourselves the life-sapping pain of creating one anew.
However, we should beware.
Instead of…
…using ^ and $ to enclose the regular expression.
# A regular expression matching a
# string of lowercase letters
/^[a-z]+$/
Use…
…\A and \z.
# A regular expression matching a
# string of lowercase letters
/\A[a-z]+\z/
But why?
Being specific in this case will reduce potential security holes in your code.
The characters ^ and $ match the beginning and end of a line, not the beginning and end of an entire string.
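To see the difference between line anchors and string anchors, here is a small sketch using String#scan on a multi-line string. The variable names are only for illustration.

```ruby
input = "first\nsecond\nthird"

# ^ and $ anchor at every line boundary, so each line matches separately.
line_matches = input.scan(/^[a-z]+$/)
# line_matches => ["first", "second", "third"]

# \A and \z anchor only at the very start and very end of the string,
# so this multi-line string yields no match at all.
string_matches = input.scan(/\A[a-z]+\z/)
# string_matches => []
```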
If your validations are not precise, you could let potentially dangerous user input through.
For example:
> "word\n<script>run_naughty_script();</script>".match?(/^[a-z]+$/)
=> true
> "word\n<script>run_naughty_script();</script>".match?(/\A[a-z]+\z/)
=> false
The string above, with its potentially harmful JavaScript, gets through the looser validation of ^ and $ because its first line matches on its own. You certainly don’t want that sort of code to potentially run on your site.
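One way to keep the stricter anchors in a single place is to wrap the check in a small predicate method. This is a minimal sketch; lowercase_word? is a hypothetical helper name, not something from Ruby or Rails.

```ruby
# Hypothetical helper: true only when the whole string is lowercase letters.
def lowercase_word?(input)
  input.match?(/\A[a-z]+\z/)
end

lowercase_word?("word")                      # => true
lowercase_word?("word\n<script>x</script>")  # => false
lowercase_word?("word\n")                    # => false, \z does not skip a trailing newline
lowercase_word?("")                          # => false, + needs at least one letter
```

Note the lowercase \z here. The uppercase \Z would still accept a single trailing newline, so "word\n" would pass.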
Why not?
This is a case where being specific is important. Just do it.
Last updated on June 10th, 2018 by @andycroll