** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD ** - Page 1082 - Computer Technical Help

Loop each char and build a new string ignoring a-z's. Much easier for someone else to understand, much easier to modify and maintain and I'm sure will be more performant as well.

Strongly disagree here. If I saw this in a code base, I'd say, "Why did you just write 3 or 5 lines of code instead of using a regex?" It's literally tailor-made for the problem.

You guys are talking about them as if they're some beastly algorithm that no one can be expected to internalize. But basic ones are incredibly simple and not hard to remember -- and that's coming from a programmer with a terrible memory.

And so what if you have to quickly google or check in an online tester? If, e.g., I haven't programmed Ruby for a while, I have to google whether upper-casing is "upcase" or "to_upper" or "uppercase". That I have to look it up isn't an argument against its being the correct solution to the problem. I'd never roll my own solution just so I wouldn't have to be bothered googling.
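For concreteness, here is a rough Ruby sketch of the two approaches being argued over: a hand-rolled loop and a one-line regex. The method names and the sample string are made up for illustration, and ASCII-only input is assumed.

```ruby
# Hand-rolled version: walk each character and keep everything outside a-z.
def strip_lowercase_loop(str)
  result = +""
  str.each_char do |ch|
    result << ch unless ch.between?("a", "z")
  end
  result
end

# Regex version: delete every character in the class [a-z].
def strip_lowercase_regex(str)
  str.gsub(/[a-z]/, "")
end

sample = "Hello World 123" # made-up input
puts strip_lowercase_loop(sample)
puts strip_lowercase_regex(sample)
```

Both versions return the same string for this input; which one reads better is exactly what the thread is arguing about.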

03-24-2017 , 10:51 AM

#27028

plexiq

veteran

Join Date: Apr 2007 Posts: 2,554

Quote:

Originally Posted by Gullanian

Agree with this for most stuff, but even simple looking regex's can have n squared performance.

Jeff Atwood did a post on it:
https://blog.codinghorror.com/regex-performance/

Small strings, small patterns sometimes = big meaningful time sinks. I always avoid using Regex if I can. Could call it premature optimisation but it's safer long term as well if someone else comes along who doesn't know what they are doing and modifies the expression horribly which is not unusual!

It's an interesting blog post, but this does not seem like an actual problem in practice tbh. The example in the blog looks heavily constructed and the solution is pretty much common sense:

Quote:

The solution is simple. When nesting repetition operators, make absolutely sure that there is only one way to match the same match.

It's not like you will easily stumble into some terribly performing regexp by accident, but if you can find a realistic example I'd be really curious to see it.

Maybe don't nest repetition operators at all if you are really paranoid about it. But avoiding regexp entirely because you may need to profile them for performance once in a decade seems like a terrible trade-off.
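To make the "only one way to match" advice concrete, here is a small sketch that counts how many different ways the nested pattern (x+)+ could divide up a run of n x's. The helper name ways_to_split is made up for illustration. It counts possibilities only and does not measure what any particular engine does, since real engines apply their own optimizations.

```ruby
# Count the ordered ways to split a run of n x's into chunks of size >= 1.
# Each split is one way (x+)+ could consume the same run of characters.
def ways_to_split(n, memo = {})
  return 1 if n.zero?
  memo[n] ||= (1..n).sum { |first| ways_to_split(n - first, memo) }
end

[1, 2, 5, 10, 20].each do |n|
  puts "#{n} x's: #{ways_to_split(n)} ways"
end
```

The count doubles with every extra x, which is why a backtracking engine can bog down when the trailing y is missing and it has to rule the alternatives out.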

03-24-2017 , 10:54 AM

#27029

gaming_mouse

Carpal 'Tunnel

Join Date: Oct 2004 Posts: 13,786

Quote:

Originally Posted by jjshabado

Interesting post!

It is, although it's hard to see how anyone moderately competent would ever write the test regex:

Code:

(x+x+)+y

instead of the equivalent:

Code:

(x+)+y

maybe there's a less contrived example that you might construct in the wild, but in years of using regexes i've never encountered this.

also note modern JS seems to handle it fine:

https://regex101.com/r/IasbGO/2

03-24-2017 , 11:00 AM

#27030

jjshabado

Carpal Tunnel

Join Date: Jul 2006 Posts: 22,732

Sometimes you code for the incompetent person (or intern) that comes after you. ;P

I honestly don't feel strongly about regexes except to the point that they should virtually never be an interview question.

03-24-2017 , 11:14 AM

#27031

gaming_mouse

Carpal 'Tunnel

Join Date: Oct 2004 Posts: 13,786

Quote:

Originally Posted by jjshabado

Sometimes you code for the incompetent person (or intern) that comes after you. ;P

this general philosophy, taken to the extreme (and i'd consider avoiding all regexes a small example of taking this to the extreme) is a virtual guarantee of a poor code base.

the rich hickey distinction between "easy" and "simple" is relevant here. proper use of high-level constructs will make things simpler. but if people aren't familiar with those constructs they won't be "easy."

the trivial example here is using "map" instead of a "for" loop. some percent of junior programmers who would have been able to read the "for" loop will be confused by "map"

but so what? you can't run a software company by dumbing everything down with the worst programmers in mind. hiring such people has already doomed you.
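For anyone who hasn't seen the two side by side, a minimal Ruby sketch of the "for" vs "map" example. The method names and the data are made up for illustration.

```ruby
# "for" loop: builds the result one step at a time.
def double_with_for(numbers)
  doubled = []
  for n in numbers
    doubled << n * 2
  end
  doubled
end

# "map": states the transformation as a single expression.
def double_with_map(numbers)
  numbers.map { |n| n * 2 }
end

p double_with_for([5, 12, 8])
p double_with_map([5, 12, 8])
```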

03-24-2017 , 11:21 AM

#27032

jjshabado

Carpal Tunnel

Join Date: Jul 2006 Posts: 22,732

Quote:

Originally Posted by gaming_mouse

this general philosophy, taken to the extreme (and i'd consider avoiding all regexes a small example of taking this to the extreme) is a virtual guarantee of a poor code base.

Any philosophy taken to the extreme is a virtual guarantee of poor outcomes.

I think we disagree on: "i'd consider avoiding all regexes a small example of taking this to the extreme".

Also, I don't think anyone actually advocated for that. There are clear cases where regexes are the tool for the job. Simple string manipulation isn't a clear case, imo.

03-24-2017 , 12:00 PM

#27033

Gullanian

Carpal 'Tunnel

Join Date: Dec 2006 Posts: 14,014

I think Regex by its nature complicates code most of the time. I understand if people feel I'm wrong here; maybe my brain is just not wired up properly to read them fluently.

In C# for the simple char removal you can do:

Code:

   // Needs System.Linq; charArray holds the characters to remove.
   var cleaned = new string(myString.Where(c => !charArray.Contains(c)).ToArray());

Which imo is a nice middle ground between writing your own loop and regex

03-24-2017 , 01:28 PM

#27034

gaming_mouse

Carpal 'Tunnel

Join Date: Oct 2004 Posts: 13,786

Quote:

Originally Posted by Gullanian

my brain is just not wired up properly to read them fluently.

knowing what i know about you, i think 30-60 minutes of effort + using them every day for a few days would make them (at least, simple ones like the exercism snippet) as readable to you as the C# snippet.

03-24-2017 , 01:31 PM

#27035

gaming_mouse

Carpal 'Tunnel

Join Date: Oct 2004 Posts: 13,786

Quote:

Originally Posted by jjshabado

i think we're on the same page except for exactly where the line is drawn.

03-24-2017 , 02:01 PM

#27036

jjshabado

Carpal Tunnel

Join Date: Jul 2006 Posts: 22,732

I feel like we should argue longer.

Spoiler:

03-24-2017 , 02:25 PM

#27037

RustyBrooks

Carpal 'Tunnel

Join Date: Feb 2006 Posts: 24,647

Quote:

Originally Posted by gaming_mouse

Code:

(x+x+)+y

instead of the equivalent:

Code:

(x+)+y

Those aren't equivalent. For example, this string does not match the first one, but does match the 2nd.

xy

The first regexp requires at least 2 'x's and the 2nd only 1.
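This is easy to check in Ruby. A small sketch, with made-up constant names and a few short inputs:

```ruby
NESTED_PAIR   = /(x+x+)+y/ # each pass through the group needs two or more x's
NESTED_SINGLE = /(x+)+y/   # each pass needs only one x

%w[xy xxy xxxy y].each do |input|
  pair   = input.match?(NESTED_PAIR)
  single = input.match?(NESTED_SINGLE)
  puts "#{input.ljust(5)} (x+x+)+y: #{pair}   (x+)+y: #{single}"
end
```

Only the "xy" row should differ between the two columns.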

03-24-2017 , 02:51 PM

#27038

saw7988

Carpal 'Tunnel

Join Date: Jul 2005 Posts: 11,463

what's the point of x+x+ vs just xx+?

ETA: actually what's the point of (x+)+ vs just x+?? (I clearly don't ever regex)

03-24-2017 , 03:00 PM

#27039

Gullanian

Carpal 'Tunnel

Join Date: Dec 2006 Posts: 14,014

Test string:
xxxxxxxxxxx

On:
https://regex101.com/

Greedy matches:
x+ = 1 match, 3 steps
x+x+ = 1 match, 4 steps
xx+ = 1 match, 3 steps
(x+)+ = 1 match, 7 steps
(x+x+)+ = 1 match, 7 steps

Nongreedy:
x+ = 11 matches, 23 steps
x+x+ = 5 matches, 15 steps
xx+ = 5 matches, 15 steps
(x+)+ = 11 matches, 46 steps
(x+x+)+ = 5 matches, 25 steps
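The match counts above (not the step counts, which are specific to regex101's engine) can be reproduced in Ruby with String#scan. This sketch assumes "nongreedy" means lazy quantifiers (+?) with global matching. The helper name match_count is made up for illustration.

```ruby
TEXT = "x" * 11 # the 11-character test string from the post

# Number of non-overlapping matches found when scanning left to right.
def match_count(pattern)
  TEXT.scan(pattern).size
end

{
  "x+"         => /x+/,
  "x+x+"       => /x+x+/,
  "xx+"        => /xx+/,
  "(x+)+"      => /(x+)+/,
  "(x+x+)+"    => /(x+x+)+/,
  "x+?"        => /x+?/,
  "x+?x+?"     => /x+?x+?/,
  "xx+?"       => /xx+?/,
  "(x+?)+?"    => /(x+?)+?/,
  "(x+?x+?)+?" => /(x+?x+?)+?/
}.each do |label, pattern|
  puts "#{label.ljust(12)} #{match_count(pattern)} match(es)"
end
```

The greedy forms swallow the whole string in one match, while the lazy forms stop as early as they can, so the scan finds many short matches.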

On web services/sites, if you use nongreedy matching I think you need to be very careful what you do as it might be a possible attack vector for a ddos, by designing some inputs that take hundreds of ms to parse.

Last edited by Gullanian; 03-24-2017 at 03:09 PM.

03-24-2017 , 03:14 PM

#27040

suzzer99

Save the Cheerleader, Save the World

Join Date: Nov 2005 Posts: 99,984

My first language was Perl.

regex

03-24-2017 , 03:52 PM

#27041

OmgGlutten!

Pooh-Bah

Join Date: Aug 2016 Posts: 5,273

In this Ruby code...

if dna.chars.any? {|char| char =~ /[^CGAT]/}

unless dna.chars.all? {|char| char =~ /[CGAT]/}

are the / coming from ruby or regex?

i don't see it referenced here http://www.freeformatter.com/regex-tester.html

03-24-2017 , 05:27 PM

#27042

gaming_mouse

Carpal 'Tunnel

Join Date: Oct 2004 Posts: 13,786

Quote:

Originally Posted by RustyBrooks

Those aren't equivalent. For example, this string does not match the first one, but does match the 2nd.

xy

The first regexp requires at least 2 'x's and the 2nd only 1.

right, good catch. still, i think my point stands as wouldn't you rewrite it like:

Code:

x{2,}y

The way it's written in the example seems purposefully unsemantic.
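One way to sanity-check a rewrite like this is to run both patterns over a batch of inputs and compare what they match. A rough sketch, with made-up names. The inputs are kept short on purpose, since the nested form can backtrack heavily on long non-matching strings in some engines.

```ruby
ORIGINAL  = /(x+x+)+y/
REWRITTEN = /x{2,}y/

# True when both patterns match the same substring (or both fail to match).
def same_result?(input)
  input[ORIGINAL] == input[REWRITTEN]
end

# Runs of 0..8 x's, with and without a trailing y.
inputs = (0..8).flat_map { |n| ["x" * n, "x" * n + "y"] }
mismatches = inputs.reject { |input| same_result?(input) }

if mismatches.empty?
  puts "patterns agree on all #{inputs.size} inputs"
else
  puts "patterns differ on: #{mismatches.inspect}"
end
```

Agreement on a sample like this is evidence, not proof, but it would have caught the "xy" case from earlier in the thread.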

03-24-2017 , 05:30 PM

#27043

gaming_mouse

Carpal 'Tunnel

Join Date: Oct 2004 Posts: 13,786

Quote:

Originally Posted by OmgGlutten!

Just like you define a literal string by enclosing it with quotes like "some string", you define a literal regex by enclosing it with forward slashes.

What was the context of that code? At a glance it seems odd. Normally you'd use chars.any? OR a regex, but not both together.

EDIT: eg, if you want to check if the dna string contains any invalid chars just do:

Code:

dna =~ /[^CGAT]/

if all chars are valid it will return nil, which evaluates to false in ruby. if there is any invalid char it will return its integer position within the string, which will evaluate to true, even if its position is 0.
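A short sketch of the points above: the slash-delimited literal, the return value of =~, and the fact that 0 is truthy in Ruby. The constant name and sample strings are made up for illustration.

```ruby
INVALID_BASE = /[^CGAT]/              # regex literal, delimited by forward slashes
from_string  = Regexp.new("[^CGAT]")  # the same pattern built from a string

puts INVALID_BASE.source == from_string.source # both hold the same pattern text

p("GATTACA" =~ INVALID_BASE) # nil when every char is valid
p("GAXT" =~ INVALID_BASE)    # index of the first invalid char
p("XGAT" =~ INVALID_BASE)    # index 0 when the invalid char comes first

# 0 is truthy in Ruby, so a match at position 0 still passes an `if`.
puts "invalid strand" if "XGAT" =~ INVALID_BASE

# match? gives a plain true/false when the position isn't needed.
puts "GATXACA".match?(INVALID_BASE)
```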

Last edited by gaming_mouse; 03-24-2017 at 05:38 PM.

03-24-2017 , 05:39 PM

#27044

goofyballer

Carpal 'Tunnel

Join Date: Jun 2005 Posts: 70,719

Quote:

Originally Posted by jjshabado

I feel like we should argue longer.

Spoiler:

I was aghast to see 43% of people in the latest SO developer survey prefer tabs to spaces. Those people are clearly idiots.

03-24-2017 , 05:42 PM

#27045

gaming_mouse

Carpal 'Tunnel

Join Date: Oct 2004 Posts: 13,786

Quote:

Originally Posted by goofyballer

I was aghast to see 43% of people in the latest SO developer survey prefer tabs to spaces. Those people are clearly idiots.

so, i use spaces. but i also have this nagging feeling that the tabs people are logically correct. a tab is a logical unit of indentation. the visual display of that unit should be decoupled from the logical unit (you might like wide, while i prefer narrow). is there any good rejoinder to that argument?

03-24-2017 , 05:44 PM

#27046

gaming_mouse

Carpal 'Tunnel

Join Date: Oct 2004 Posts: 13,786

Quote:

Originally Posted by jjshabado

I use vim bindings in most of my IDEs, but I generally use the IDE specific commands for things like search and replace.

we definitely need to argue longer

03-24-2017 , 09:29 PM

#27047

codeartisan

centurion

Join Date: Jun 2008 Posts: 181

Quote:

Originally Posted by goofyballer

I was aghast to see 43% of people in the latest SO developer survey prefer tabs to spaces. Those people are clearly idiots.

Anyone who cut their teeth editing makefiles (in vi, of course) will always prefer tabs, as makefiles required tabs.

Vi and tabs are superior.

QED.

03-24-2017 , 10:43 PM

#27048

jjshabado

Carpal Tunnel

Join Date: Jul 2006 Posts: 22,732

Quote:

Originally Posted by gaming_mouse

Hah. I agree completely with this.

03-25-2017 , 08:40 AM

#27049

saw7988

Carpal 'Tunnel

Join Date: Jul 2005 Posts: 11,463

Quote:

Originally Posted by gaming_mouse

Amazing. This. I use spaces as well (probably cuz python?), but the best arguments I see for tabs >> best arguments I see for spaces. And there's the philosophical points you touch on. It basically never comes up with modern editors, but I've always thought tabs vs spaces was an interesting discussion nonetheless.

03-25-2017 , 12:32 PM

#27050

goofyballer

Carpal 'Tunnel

Join Date: Jun 2005 Posts: 70,719

I think g_m's argument is solid in theory but encounters trouble in practice...
- if you need indentation of less than a full unit, then... oops. Maybe your tabs are set to 4, but when long lines spill over you try to match something w/ the line above, or only indent 2. Or say your tabs are set to 2 but you're trying to line something up with the line above, 4 spaces in: if you do that indent w/ 2 tabs instead of 4 spaces, you run into the problem below immediately.
- this is more a problem with people who won't stick to tabs if that's what's agreed upon, but I've never seen a file with mixed tabs and spaces where the tabs didn't wind up making **** look really wrong as soon as you open it on a different tab setting than the person who wrote the tabs
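A small sketch of the alignment problem described above. The helper expand_tabs and the sample lines are made up for illustration; the helper just renders tabs the way an editor with a given tab width would.

```ruby
# Render tab characters as spaces up to the next tab stop for a given width.
def expand_tabs(line, width)
  out = +""
  line.each_char do |ch|
    if ch == "\t"
      out << " " * (width - out.length % width) # pad to the next tab stop
    else
      out << ch
    end
  end
  out
end

first_line   = "\tfoo(arg1,"  # one tab of indentation
all_tabs     = "\t\targ2)"    # a second tab used to line up under arg1
tab_plus_pad = "\t    arg2)"  # tab for indentation, spaces for alignment

[4, 2].each do |width|
  puts "tab width #{width}:"
  [first_line, all_tabs, tab_plus_pad].each { |line| puts expand_tabs(line, width) }
end
```

At width 4 all three lines line up. At width 2 the all-tabs continuation line drifts out of alignment, while the version that uses spaces for the alignment part stays put.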

Page 1082 of 1603
