** UnhandledExceptionEventHandler :: OFFICIAL LC / CHATTER THREAD **

12-19-2017 , 03:53 PM
Difference between int and string has to be a troll.
12-19-2017 , 04:04 PM
Quote:
Originally Posted by Larry Legend
Difference between int and string has to be a troll.
I've definitely seen some variants of that - maybe not literally knowing the difference between string and integer but people who don't realize, for instance, that typing something into a textbox and having the application process it as a number necessarily entails conversion.
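To make the conversion point concrete, here is a minimal Python sketch. It assumes the textbox hands the application a plain string; `parse_quantity` is a hypothetical helper name, not something from any particular framework.

```python
from typing import Optional


def parse_quantity(raw: str) -> Optional[int]:
    """Convert raw textbox text to an int, or return None if it is not a whole number."""
    try:
        # Input always arrives as text; int() performs the explicit conversion.
        return int(raw.strip())
    except ValueError:
        # Covers empty input, letters, decimals such as "4.2", and so on.
        return None


# Without conversion, "+" concatenates the text instead of adding numbers.
as_text = "42" + "1"
# With conversion, the same input behaves as a number.
as_number = parse_quantity("42") + 1

print(repr(as_text), as_number)
```

The helper returns `None` instead of raising so the caller has to decide what to show the user when the input is not a number.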
12-19-2017 , 04:13 PM
Quote:
Originally Posted by Larry Legend
Difference between int and string has to be a troll.
I am extremely sad to report that it was a factual statement.

Quote:
Originally Posted by candybar
They are not "seasoned engineers" and I don't think you could extrapolate much from this experience. What are you doing there?
I’m not doing any software engineering, so unrelated work. But their poor skills directly translate into a worse experience for me and the people who give us money, so...

Of course, even when I look at companies we routinely deal with, it only seems marginally better. Perhaps it’s a perspective issue. As candybar noted, I may just have my expectations all wrong.
12-19-2017 , 04:14 PM
If strongly typed languages mean I can take a data file, chop out 100 columns of unused data, and have it work the same in an R app, I'm all for it. Having a change in the size of the input data set generate an error is pretty annoying.
12-19-2017 , 04:52 PM
Quote:
Originally Posted by Loki
Is there anywhere that has, like, good software? Or will basically any piece of software that has more than two people working on it always be doomed to end up a steaming pile of poo? Like, besides the simple Linux programs that do one or two things and do them well, are there engineers at google who look at their code base and think, “the horror?”
I have a few friends who work at google, who swear the code base is pristine. Of course they each only work on a small part of the whole. I've worked a few places that I thought had pretty good code bases. At one of those places, to my surprise, one of our founders said "if we gave our code base to our competitors, it would set them back 6 months" (of course, he was advocating for replacing it with the shiny new project he and his team were working on, and he eventually did).

The website I work on is pretty good - but it has a small number of hands in it. In its entire lifetime probably 3 backend devs have worked on it, maybe 3 or 4 FE, and probably 2/3 of it was written by 1 FE and 1 BE.

I have worked places with people I would consider bad-to-mediocre. In fact that's usually when I decide to leave a place, when the people I think do good work start to leave, and the mediocre guys are left, and especially when they backfill the good guys with either more mediocre guys or only junior devs.

I have worked places where people would argue about how long a bug should take to fix, when I'd chime in and say, ok, I fixed it, can we move on?

The place I work now has a newish cloud version (maybe 1 year old or less?) that replaced an older "appliance" version (in some cases that's literal, to some customers we literally ship them a box that they plug into their network). The appliance is considered to be ugly, badly written, etc. People don't like to work on it, and in fact it became such a problem that they moved development of it to an offshore unit. It's since been moved back, because we still make a lot of money on it, and the offshore situation wasn't working out (apparently)

The CEO made a pretty good point about it, which is, that "bad" software built our company, it's made many millions of dollars, it continues to outsell our shiny new cloud version.

Most software is "bad" in some way. It usually seems especially bad if you didn't write it yourself.
12-19-2017 , 05:26 PM
This is turning into alt.sysadmin.recovery...
12-19-2017 , 06:21 PM
Quote:
Originally Posted by Loki
Sadly, I think it’s a case of the latter. Or is it normal for seasoned engineers to not be able to tell the difference between an integer and a string? I feel like it’s not normal.
Do you mean "normal" as in "not unusual" or "wtf?" If it's the first definition, it's pretty common. I'm not sure how someone with 15+ years of programming experience ends up not being able to program, but it is what it is.
12-19-2017 , 06:32 PM
Quote:
Originally Posted by Barrin6
My brother interviewed at my company for Software Engineer. Took us 3 weeks and they somehow are going to throw him an offer for Data Engineer instead.

Anyone know if it's hard to switch back to Software Engineer later down the road? He doesn't have any offers yet, hoping to hear back more soon by mid January. Googling shows a lot of people trying to go from SE to DE, but not vice versa.
Data engineering is a great first step to becoming a remote contractor working on ****ed-up code bases.

Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious. -- Fred Brooks.

A "data engineer" is a highly ambiguous role.

1- You can be hired to open up Mohalo and click "good" and "bad" for 8 hours a day, sending the "bad" to the "level 2 data engineers."

2- You can effectively be a Linux person writing Python scripts to do ETL from every data file type you can imagine to a single large database.

3- You can be staging massive queries for a data scientist

4- You can effectively be a Jr DBA, rewriting queries, views, procedural functions, and migrations and doing minor database tuning.

I consider numbers 2 and 3 a standard data engineer job.

No one self-respecting does number 1 for long.

Number 4 means you are probably taking on a label to justify low pay.
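
For anyone unsure what number 2 looks like in practice, here is a minimal sketch using only the Python standard library. The `products` table, the field names, and the sample inputs are all hypothetical; a real job would read actual files and target a real database, not in-memory SQLite.

```python
import csv
import io
import json
import sqlite3


def extract_csv(text):
    """Extract: parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_json(text):
    """Extract: parse a JSON array of objects into a list of dicts."""
    return json.loads(text)


def transform(record):
    """Transform: trim the name and coerce the price to a float."""
    return (record["name"].strip(), float(record["price"]))


def load(conn, rows):
    """Load: insert the normalized rows into the target table."""
    conn.executemany("INSERT INTO products (name, price) VALUES (?, ?)", rows)
    conn.commit()


def run_etl(conn, csv_text, json_text):
    """Run extract, transform, and load; return the resulting row count."""
    conn.execute("CREATE TABLE IF NOT EXISTS products (name TEXT, price REAL)")
    records = extract_csv(csv_text) + extract_json(json_text)
    load(conn, [transform(r) for r in records])
    return conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]


# Two differently shaped sources ending up in one table.
conn = sqlite3.connect(":memory:")
csv_text = "name,price\nwidget, 2.50\n"
json_text = '[{"name": " gadget ", "price": "10"}]'
row_count = run_etl(conn, csv_text, json_text)
rows = conn.execute("SELECT name, price FROM products ORDER BY rowid").fetchall()
print(row_count, rows)
```

Most of the real work lives in `transform`, since every source disagrees about types, whitespace, and missing values.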
12-19-2017 , 08:57 PM
Totally agree that data engineer is really ambiguous. But I don’t really think your examples are very comprehensive.

To me the data engineer role implies very large amounts of data (you’re not dealing with single machines or databases). The job is about working with that large amount of data in reliable ways and making it accessible for various purposes: data scientists, customer facing features, data analysts, etc.

In my view of the role you’re much more likely to be using hive/spark/redshift/<other big data tool> than writing queries against a standard database.
12-19-2017 , 09:29 PM
You have to de-Dickensian the view of the world when you read DaveT's posts before trying to apply them to your own.
12-19-2017 , 10:50 PM
Quote:
Originally Posted by kerowo
You have to de-Dickensian the view of the world when you read DaveT's posts before trying to apply them to your own.


Well done. Legit out loud chuckle from that.
12-19-2017 , 10:59 PM
Quote:
Originally Posted by kerowo
You have to de-Dickensian the view of the world when you read DaveT's posts before trying to apply them to your own.
💯
12-19-2017 , 11:25 PM
Quote:
Originally Posted by jjshabado
Well done. Legit out loud chuckle from that.
I did too. Thanks Kerowo, it's probably the first time I chuckled this month.

jj, I think two of my examples sort of dealt with that. There is the data scientist assistant, then there was the massive ETL converter.

But do note that it's very difficult for any company to legitimately get to the scale that they need big data and have to use all the tools you mention to manage it. For most reasonably sized (and engineered) applications, you really don't need it, but that doesn't mean you aren't dealing with many gigs of data with high throughput, accuracy, and i/o needs.

I also would hope that a Jr data engineer isn't going to be thrown into terabytes of data. There is so much nuance to data management that, unless the applicant has dealt with a single database (of at least 50 screwed up tables), they really shouldn't be given the keys to larger data sets. (and before I'm asked, there is no book on this, you just have to **** it up yourself and learn to fix it).

I can name at least a few companies that deal with many terabytes of data using PostgreSQL, and ugh, so sorry for them that they started that when Slony was the only viable option.
12-19-2017 , 11:48 PM
Quote:
Originally Posted by daveT
I did too. Thanks Kerowo, it's probably the first time I chuckled this month.
me_irl
12-20-2017 , 12:12 AM
It’s really not that difficult to get to the scale where big data tools are the right choice. Many industries (finance, adtech, ecommerce, social media, etc.) have common problems that need these tools. There’s a reason that they’re growing so fast in usage and development.

Also, it’s just nonsense that someone has to deal with a single database before dealing with larger data sets. The intersection of relevant skills is minimal.
12-20-2017 , 12:34 AM
I suppose that I'll give two examples of companies that I know for certain use relational databases.

First, when you search for a home on Zillow, all of Zillow's clones, and any realtor you heard of, you are actually querying the exact same data from a single company. This locates the millions of homes on sale, a subset near you, broken down by price, size, location, realtor, and so on. Granted, they may not be Google, but I know for a fact that this is all relational.

When you search for a song on Apple, Spotify, YouTube, use Kazaam, and so on, all that metadata is relayed by a relational database from one of two companies.

I know of a few other companies that deal with climate data, and they all strictly use relational databases.

They all need "data engineers," but hey, it's not NoSQL, so not legit. I'll walk stage left now.
12-20-2017 , 01:16 AM
We have a data engineer at my company and his job is essentially trying to understand how a lot of incredibly unreliable queries were built and re-writing them and improving data quality to provide consistent results across over a hundred database tables.

I don't envy him.
12-20-2017 , 03:47 AM
My first programming job was using SAS for a statistical consulting firm to create reports and charts and graphs from various disparate data sources (on the defendant side in major lawsuits). I guess I was a data engineer. It was a pretty cool job.

I've drawn on what I learned there countless times in subsequent jobs. I can look at a graph or dataset and immediately spot ambiguities or potential issues. Nothing is more rewarding than taking opposing counsel's analysis and ripping it to shreds.
12-20-2017 , 05:32 AM
Quote:
Originally Posted by Larry Legend
We have a data engineer at my company and his job is essentially trying to understand how a lot of incredibly unreliable queries were built and re-writing them and improving data quality to provide consistent results across over a hundred database tables.

I don't envy him.
It's actually not that bad as long as no one forces you to use a GUI.

I rather enjoy this kind of work, but you know, Dickens.
12-20-2017 , 07:27 AM
Quote:
Originally Posted by daveT
I suppose that I'll give two examples of companies that I know for certain use relational databases.
Dave, when someone agrees with you that a term is vague but says your definitions aren’t comprehensive, providing two examples where you think you’re right doesn’t add anything.


Quote:
Originally Posted by daveT
They all need "data engineers," but hey, it's not NoSQL, so not legit. I'll walk stage left now.

You should walk stage left because you’re doing your typical thing where you’re arguing against something nobody is saying.

And, yeah, all sorts of stuff is in transactional databases. Even big data! But in this case your examples of houses and songs generally aren’t even big data. It’s just not anywhere close to the magnitude of data that’s recorded by the industries I listed earlier.
12-20-2017 , 12:46 PM
Quote:
Originally Posted by jjshabado
And, yeah, all sorts of stuff is in transactional databases. Even big data! But in this case your examples of houses and songs generally aren’t even big data. It’s just not anywhere close to the magnitude of data that’s recorded by the industries I listed earlier.
This - I think dave is confusing the number of data access requests with the amount of data. Zillow and Youtube almost certainly have big data (any kind of raw user engagement data), but housing and song metadata don't accumulate at a particularly high rate.
12-20-2017 , 12:56 PM
And if we get into talking about data access requests, we're almost certainly not talking about a single instance of the database. There are almost certainly some number of read replicas that are serving the majority of the requests.
12-20-2017 , 01:51 PM
Quote:
Originally Posted by jjshabado
And if we get into talking about data access requests, we're almost certainly not talking about a single instance of the database. There are almost certainly some number of read replicas that are serving the majority of the requests.
Exactly - and for a well-scaled system, a minuscule percentage of the requests are even going to go there - most are gonna just hit the cache. The actual backend is for maintaining the integrity of the data and for a small number of users (generally internal systems and definitely not end-users of the sites) that do need online access.
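
The read path described here can be sketched roughly as below. This is a toy cache-aside example: plain dicts stand in for the cache and the read replica, and `ReadPath`, `replica_reads`, and the `listing:1` key are made-up names for illustration, not anyone's real architecture.

```python
class ReadPath:
    """Toy cache-aside read path: check the cache first, fall back to a replica."""

    def __init__(self, replica):
        self.replica = replica    # stand-in for a read replica (key -> value)
        self.cache = {}           # stand-in for an in-memory cache such as redis
        self.replica_reads = 0    # counts how often a read reaches the database tier

    def get(self, key):
        # Cache hit: the database tier is never touched.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read from the replica and remember the answer.
        self.replica_reads += 1
        value = self.replica.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def invalidate(self, key):
        # Called after a write to the primary so stale data is not served forever.
        self.cache.pop(key, None)


store = ReadPath({"listing:1": {"price": 350000}})
for _ in range(1000):
    store.get("listing:1")

# Only the first of the 1000 reads reached the replica.
print(store.replica_reads)
```

Invalidation is the hard part in a real system; this sketch only shows why so few requests ever reach the database.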
12-20-2017 , 02:32 PM
Even big data is ambiguous. If it's not a large data set, then maybe it's fast queries, or maybe it's fast writes, or maybe just say **** it and use redis.

I think there should be totally different names for the various data roles. The vast majority of jobs I've seen that are labeled data engineer are over relatively small data sets, and more often than not they are data munging and ETL.
12-20-2017 , 03:01 PM
Big data is ambiguous, but I think it's fair to say that if it can be stored (or processed) on a single physical machine of reasonable cost, you're not dealing with big data.

I'm going to assume "vast majority" is just hyperbole or you don't have a very representative view of the world. But, nonetheless, I'm sure there are many advertised roles like that. My personal view is that writing sql and looking at queries for a database isn't data engineering in the same way that cleaning data isn't data science.

But, obviously, that's subjective and just my opinion. I acknowledge there's a real good argument to be made that 'data engineering' or 'data science' is just whatever people with those job titles do - in which case it probably includes working with small data sets.