optimization - Optimize data upload on Google BigQuery -


I'm using the Google BigQuery platform to upload a lot of data (more than 6 GB) and work with it as a data source in Tableau Desktop. At present it takes me on average 1 hour to upload 12 tables in CSV format (6 GB in total), uncompressed, with a Python script that uses the Google API. The Google docs specify: "If loading speed is important to your app and you have a lot of bandwidth to load your data, leave your files uncompressed." How can I optimize this process? Would compressing the CSV files improve the upload speed? I'm also thinking about using Google Cloud Storage, but I expect the problem would be the same. I need to reduce the time it takes to upload the data files, and I haven't found a good solution.
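To give an idea of the current setup, each table is loaded with something roughly like the following (a simplified sketch using the google-cloud-bigquery client; the table and file names are placeholders, and the real script loops over all 12 tables):

```python
from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,   # each CSV is assumed to have a header row
    autodetect=True,       # let BigQuery infer the schema
)

# One load job per table, streaming the uncompressed CSV over HTTP.
with open("table_01.csv", "rb") as f:
    job = client.load_table_from_file(
        f, "my_project.my_dataset.table_01", job_config=job_config
    )
job.result()  # wait for the load job to finish
```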

Thanks in advance.

Compressing the input data will reduce the time it takes to upload the data, but it will increase the time the load job takes to execute once the data has been uploaded (compression restricts our ability to process the data in parallel). Since it sounds like you'd prefer to optimize for upload speed, I'd recommend compressing the data.
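For example, you could gzip each CSV locally before running the load job. A minimal sketch, assuming the load job accepts the gzipped file the same way it would a .gz file in Cloud Storage (file names are placeholders):

```python
import gzip
import shutil

# Compress the CSV once, locally, before uploading it.
with open("table_01.csv", "rb") as src, gzip.open("table_01.csv.gz", "wb") as dst:
    shutil.copyfileobj(src, dst)

# The load call itself stays the same as before, just pointed at the
# smaller .gz file instead of the uncompressed CSV, e.g.:
# client.load_table_from_file(f, "my_project.my_dataset.table_01", job_config=job_config)
```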

Note that if you're willing to split your data into several chunks and compress each of them individually, you can get the best of both worlds: fast uploads and parallel load jobs.
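A rough sketch of that pre-sharding step (the chunk size, file names, and header handling are just illustrative assumptions):

```python
import gzip
import itertools

CHUNK_ROWS = 1_000_000  # rows per shard; tune to your data

with open("big_table.csv", "r", encoding="utf-8") as src:
    header = src.readline()
    for i in itertools.count():
        rows = list(itertools.islice(src, CHUNK_ROWS))
        if not rows:
            break
        # Write each chunk as its own gzip-compressed CSV.
        with gzip.open(f"big_table_{i:03d}.csv.gz", "wt", encoding="utf-8") as dst:
            dst.write(header)   # repeat the header in every chunk
            dst.writelines(rows)
```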

Uploading to Google Cloud Storage should have the same trade-offs, except for one advantage: you can specify multiple source files in a single load job. This is handy if you pre-shard your data as suggested above, because you can then run a single load job that specifies several compressed input files as its source files.
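For example (a sketch assuming the google-cloud-storage and google-cloud-bigquery clients; the bucket, dataset, and table names are placeholders):

```python
from google.cloud import bigquery, storage

# Copy the gzipped shards to Cloud Storage.
storage_client = storage.Client()
bucket = storage_client.bucket("my-upload-bucket")
for i in range(12):
    blob = bucket.blob(f"shards/big_table_{i:03d}.csv.gz")
    blob.upload_from_filename(f"big_table_{i:03d}.csv.gz")

# Run one load job that lists all the shards as source URIs.
# A wildcard like "gs://my-upload-bucket/shards/big_table_*.csv.gz"
# also works as a single source URI.
bq_client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
)
uris = [f"gs://my-upload-bucket/shards/big_table_{i:03d}.csv.gz" for i in range(12)]
job = bq_client.load_table_from_uri(
    uris, "my_project.my_dataset.big_table", job_config=job_config
)
job.result()
```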

