I have over a decade of development experience.
This is the first time I'm working with a database of over 400 million total records.
Now I've run into new issues I need help with (not only in the DB, but also in Windows Server 2012 R2).
My only job is to run calculations, which means the server should use 100% of its CPU and produce as many calculations and insertions into the DB as possible.
Earlier, I ran into blocking caused by the 15+ indexes I had on each table. I didn't know that indexes make inserts so expensive. After consulting a knowledgeable friend, who stressed to me the "cost" of insertion (I hadn't known about it before), I understood the blocking.
For example, one table weighs 7 GB while its indexes weigh 35 GB. With that much data, my queries were taking a very long time.
I reran the calculations with only a clustered index on my unique value and no other indexes. Later I added one more index, on a single column, and it still didn't conflict with the insertions.
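Schematically, the layout looks like this (the table and column names, dbo.Results / CalcKey / CalcValue, are placeholders, not my real schema):

```sql
-- One clustered index on the unique value enforces uniqueness
-- while keeping per-insert overhead low.
CREATE TABLE dbo.Results (
    CalcKey   BIGINT NOT NULL,
    CalcValue FLOAT  NOT NULL,
    CONSTRAINT PK_Results PRIMARY KEY CLUSTERED (CalcKey)
);

-- The one extra nonclustered index I added later, on a single column:
CREATE NONCLUSTERED INDEX IX_Results_CalcValue
    ON dbo.Results (CalcValue);
```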
Afterwards, I measured the insert time versus the calculation time, and saw that the total insertion time was much higher than the calculation time. I asked friends for advice and considered moving to another DB (NoSQL), but one friend told me to try the BCP utility and bulk insert. I read up on bulk insertion, found SqlBulkCopy in C#, and switched to it: I accumulate 1,000 calculation records, convert them to a table, and pass that to the bulk copy object, and the insert takes almost no time.
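The batching looks roughly like this (a sketch with placeholder table and column names, not my real code):

```csharp
using System.Data;
using System.Data.SqlClient;

// Accumulate calculation results in a DataTable, then push every
// 1,000 rows to SQL Server in a single SqlBulkCopy call.
DataTable buffer = new DataTable();
buffer.Columns.Add("CalcKey", typeof(long));     // placeholder column
buffer.Columns.Add("CalcValue", typeof(double)); // placeholder column

void Flush(string connStr)
{
    using (var conn = new SqlConnection(connStr))
    {
        conn.Open();
        using (var bulk = new SqlBulkCopy(conn))
        {
            bulk.DestinationTableName = "dbo.Results"; // placeholder
            bulk.WriteToServer(buffer);
        }
    }
    buffer.Clear();
}

// In the calculation loop:
//   buffer.Rows.Add(key, value);
//   if (buffer.Rows.Count >= 1000) Flush(connStr);
```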
At peak I managed to insert 9K records per second (where I'd like to reach 100K), 308K per minute, and nearly 18M per hour.
Again, for my task, if I could do 100M in one hour it would be perfect.
My server has 256 GB of memory and 2 CPUs with 10 cores each and 25 MB cache.
This one: Intel® Xeon® Processor E5-2650 v3 (25M Cache, 2.30 GHz).
So I have the power, and this server ONLY needs to run these calculations: no online traffic, no answering queries, no multiple requests like a web server.
I changed my software to run multi-threaded. When I first launched 50 instances (starting the same program 50 times and letting each take a different part of the calculation), it used only 15%-25% of the CPU, so I moved to over 250 threads, each with its own connection. At first the software crashed once I reached 100 connections, because it could not open more than 100, so I raised the max connection pool in my connection string to 1,000.
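The connection string change was just this (the server and database values here are placeholders):

```text
Server=myServer;Database=myDb;Integrated Security=True;Max Pool Size=1000;
```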
I noticed that once a table is full (over 100M records per table) and I keep calculating, the calculation process slows down dramatically in calculations per second, yet the CPU stays near 100%.
I figure that when inserting into a table with that much data, the clustered index must check that each record doesn't already exist (I can't give up uniqueness; duplicates would break my calculations when I query the results). So I am now working on copying the data to new archive/temp tables and then letting the emptied table refill; on top of that, my software threw an error that I never got any output for.
While that was in progress, the copy query ran for over an hour and got SUSPENDED. I also started creating an index on the field my WHERE clause is based on, while an insert into a new table was running (on another table), and both are now suspended and locked.
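The copy itself is basically this (placeholder names again):

```sql
-- Copy everything out to an archive table, then empty the hot table
-- so inserts hit a small table again.
SELECT *
INTO dbo.Results_Archive
FROM dbo.Results;

TRUNCATE TABLE dbo.Results;
```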
I'd like to know if I could have an experienced DBA at my disposal whom I can ask any question. I am a developer with over 10 years of experience, including database experience, but only as a developer.
I am using this platform to find the expert, but after that I'd like to deal with you directly.
I won't open a new project for every question I have.
P.S. This is not a project to build anything; it's only to give me tips and teach me some database principles.
Keep it easy for me to understand.
I would love to read your comments.
Please let me know,
Isaac
347-430-5988
Skype:ieitani