The problem is multi-fold. The least of it is that no one at my job, despite building their own computers and being pretty darn good at Excel, is technical beyond that point.
I have to maintain and track a crap-ton of product listings: over 200,000 individual listings and about 25,000 photos. The tools of this trade? You guessed it: Excel and FTP. Yeah, it is as miserable as it sounds. FileZilla is a roaring piece of trash.
The crux of it all is that some company everyone here has heard of likes to use flat files. Whoever created those things... /rant. The plus side is that they aren't strictly required for my day-to-day use, but I still have to stick with the template files. I have 4 different files that have to be properly filled out -- unless you like reading indecipherable error messages -- and they each have over 200 columns. The counts, as best I remember, are 212, 208, 214, and 208 columns.
And finally, as I'm sure jj can attest, once an Excel file gets over 100,000 cells, things start going sideways, which means freeze-ups, data corruption, hours of lost work, phantom values, etc., etc.
This whole issue is compounded by the fact that I only need to fill out 60 to 70 columns on each template file, so I have a TON of blank space. As an aside, I get a decent amount of entertainment watching other people try to use these files. First instinct, of course, is to grab a mouse and use that little bar on the bottom right. I'm really good at navigating these things right now, but that's because I sort of remembered the columns I often need and use ctrl-F. My boss was watching me and he says: "Why don't you use a mouse? It's so much faster."
So, in order to keep my sanity, I started working on the database. V1 was a steaming pile of trash: it was simply two tables, both of the same category -- one at 214 columns, the other at 210 -- and it was designed to be select * and hop along. Unfortunately, I was trying to update the 210-column table from the 214-column table. That didn't go well at all. I let the update run for 90 minutes and it just didn't happen. The results reflected the 30 minutes of effort I put into doing the database, so no loss.
Since there are two categories I work with (let's pretend they're cups and shirts), and I have 4 tables, they can be paired into fraternal twins: they even share the same SKUs. Unfortunately, these pairs are only about 80% the same, and the columns are in entirely different orders. This is sort of a good thing but sort of a bad thing, as v1 made obvious.
The first division is deconstructing the two classes of tables into pairs; then, knowing that they mostly share the same information, I can deconstruct those into smaller tables. The smaller tables are:
Dynamic -- Values that are fairly distinct to each SKU, but don't change that often. ex. product description.
Evergreen -- Values that are common and will never change. ex. where do I work again?
Null -- It's just a place-holder for all of the columns that are normally NULL, so it is really just a huge blank table with a column of skus.
With this collection of tables, I can now write a view that extracts the data that is needed and pushes out the two template files I need, in the correct order.
I can break this down one more step. Despite there being so many listings, there aren't that many model numbers, so there is no need to keep 200,000 rows of sku & pictures. If I can break the skus down into model numbers, I can have a model_number -> pictures relation, and then have a sku -> model_number lookup table (LUT). The pictures table will only have about 1,000 rows. The LUT will be similar, so I have:
Code:
dynamic (sku, product_name, a, b, c, d...)
Code:
evergreen (sku, info1, info2, info3...)
Code:
null (sku, nv1, nv2, nv3...)
Code:
pictures_via_sku (sku, pic1, pic2, pic3...)
Code:
pictures_via_model_number (model_number, pic1, pic2, pic3...)
-- unfortunately, I don't have all the model numbers prepared yet, but this will be a HUGE win once it is completed. The pictures, names, and paths change constantly --
Code:
sku_model_lookup_table (sku, model_number)
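As a sketch of why the LUT is such a win: ~1,000 picture rows can serve every SKU once the lookup is in place. The table and column names below are trimmed-down stand-ins for the schemas above, and I'm assuming SQLite just for illustration, not whatever engine you actually run:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Minimal stand-ins for the two tables described above:
cur.execute("CREATE TABLE pictures_via_model_number "
            "(model_number TEXT PRIMARY KEY, pic1 TEXT)")
cur.execute("CREATE TABLE sku_model_lookup_table "
            "(sku TEXT PRIMARY KEY, model_number TEXT)")

# One picture row can back any number of SKUs:
cur.execute("INSERT INTO pictures_via_model_number VALUES ('M-100', 'cup_front.jpg')")
cur.executemany("INSERT INTO sku_model_lookup_table VALUES (?, ?)",
                [("SKU-1", "M-100"), ("SKU-2", "M-100"), ("SKU-3", "M-100")])

# Resolve pictures per SKU through the LUT:
rows = cur.execute("""
    SELECT l.sku, p.pic1
    FROM sku_model_lookup_table AS l
    JOIN pictures_via_model_number AS p USING (model_number)
    ORDER BY l.sku
""").fetchall()
print(rows)  # every SKU picks up the single shared picture row
```

When the pictures, names, and paths change, only the ~1,000-row model table gets touched; the 200,000 SKU rows never move.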
The PKs are all bound to the SKU. There is actually a serial counter and a sub-class column on the dynamic table, which precede the SKU in the order, so the dynamic table looks like this. The counter serves no purpose but to sort the products, since order does matter. The ptype simply serves as a simple way to sub-select. I could just as well do a LUT on product types too... but YAGNI:
Code:
dynamic (pnumber, ptype, sku, product_name, a, b, c, d...)
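Since pnumber exists only for sort order and ptype only for sub-selecting, the typical pull against the dynamic table is just a filter plus an ORDER BY. A sketch with made-up rows (again assuming SQLite for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE dynamic "
            "(pnumber INTEGER PRIMARY KEY, ptype TEXT, sku TEXT UNIQUE, product_name TEXT)")
cur.executemany("INSERT INTO dynamic VALUES (?, ?, ?, ?)", [
    (2, "cup",   "SKU-B", "Blue Cup"),
    (1, "cup",   "SKU-A", "Red Cup"),
    (3, "shirt", "SKU-C", "Red Shirt"),
])

# ptype sub-selects the category; pnumber keeps the listing order stable.
cups = cur.execute(
    "SELECT sku, product_name FROM dynamic WHERE ptype = 'cup' ORDER BY pnumber"
).fetchall()
print(cups)  # [('SKU-A', 'Red Cup'), ('SKU-B', 'Blue Cup')]
```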
finally:
Code:
view1 on
select
sku, a, b, c, d, e, f....
from all_them_tables
Code:
view2 on
select
sku, d, a, c, f, g, r, info1....
from all_them_tables
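Concretely, the two views are the same joins with the column list spelled out in each template's order. A sketch with placeholder columns (the real lists obviously run 200+ wide, and I'm using SQLite here purely for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE dynamic   (sku TEXT PRIMARY KEY, a TEXT, b TEXT);
CREATE TABLE evergreen (sku TEXT PRIMARY KEY, info1 TEXT);
INSERT INTO dynamic   VALUES ('SKU-1', 'alpha', 'beta');
INSERT INTO evergreen VALUES ('SKU-1', 'acme co');

-- Same joins, different column order per template:
CREATE VIEW view1 AS
  SELECT d.sku, d.a, d.b, e.info1
  FROM dynamic AS d JOIN evergreen AS e USING (sku);

CREATE VIEW view2 AS
  SELECT d.sku, d.b, e.info1, d.a
  FROM dynamic AS d JOIN evergreen AS e USING (sku);
""")

v1 = cur.execute("SELECT * FROM view1").fetchall()
v2 = cur.execute("SELECT * FROM view2").fetchall()
print(v1)  # [('SKU-1', 'alpha', 'beta', 'acme co')]
print(v2)  # [('SKU-1', 'beta', 'acme co', 'alpha')]
```

The views are where the "entirely different column order" problem dies: the tables stay normalized, and each template's ordering lives in exactly one SELECT list.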
and that is the intended structure... The LUT will be quite valuable for many other areas I have to work with, but hopefully this gives a decent overview of the situation. Oh, right, did I forget to mention that I have to do the above 2x?
Interestingly, the view generates the template file I need in about 17 seconds; the huge monster relation took up to 3 times as long. The problem is that a full export still takes about 5 minutes, which I guess is a habit I can easily live without.
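For the export itself, dumping a view straight to a delimited file is just a loop over the cursor. A sketch assuming SQLite plus Python's csv module (swap in whatever your engine's COPY/export equivalent is; the StringIO is a stand-in for a real output file):

```python
import csv
import io
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE dynamic (sku TEXT PRIMARY KEY, product_name TEXT);
INSERT INTO dynamic VALUES ('SKU-1', 'Red Cup'), ('SKU-2', 'Blue Cup');
CREATE VIEW view1 AS SELECT sku, product_name FROM dynamic ORDER BY sku;
""")

# Stream the view out row by row -- no 200-column spreadsheet required.
buf = io.StringIO()  # stand-in for open('template1.txt', 'w', newline='')
writer = csv.writer(buf, delimiter="\t")
rows = cur.execute("SELECT * FROM view1")
writer.writerow(d[0] for d in rows.description)  # header from cursor metadata
writer.writerows(rows)
print(buf.getvalue())
```

Since the cursor streams, memory stays flat no matter how many listings go out; at that point the 5 minutes is mostly disk and driver overhead rather than anything you can refactor away.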