The tool aims to provide an efficient way of uploading large blobs, as blocks of data, to Windows Azure Blob storage. If the upload fails midway, the tool should resume the upload from where it failed rather than starting over.
Solution Approach
The file to be uploaded is broken down into individual blocks. Each block has a size specified by the user and is identified by a unique ID. Each block is scoped by a blob name (which is the same as the file name). The upload process uploads each block individually and records the current upload position (the stream position in bytes) in an XML file (uploadmetadata.xml, created in the directory from which the tool executable was invoked). Following is a sample of this XML file:
<uploadinfo filename="D:\test\test.exe" streampos="34862440" isUploadComplete="true" />
<uploadinfo filename="D:\test\test1.exe" streampos="276216" isUploadComplete="false" />
In this XML file, each input file to be uploaded is represented by an “uploadinfo” node, which specifies:
- Input Filename (“filename” attribute),
- Current upload stream position in bytes (“streampos” attribute). If the upload fails midway, the next upload resumes from the position specified by this attribute.
- Completion status of the upload (“isUploadComplete” attribute)
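The bookkeeping described above can be sketched in Python as follows. This is only an illustration, not the tool's actual code: the function names load_state and save_state are hypothetical, and because the sample shows no enclosing element, the sketch assumes the “uploadinfo” nodes sit under a hypothetical root element named “uploads”.

```python
import os
import xml.etree.ElementTree as ET


def load_state(meta_path, filename):
    """Return (streampos, is_complete) for filename, or (0, False) if unknown."""
    if not os.path.exists(meta_path):
        return 0, False
    root = ET.parse(meta_path).getroot()
    for node in root.iter("uploadinfo"):
        if node.get("filename") == filename:
            complete = node.get("isUploadComplete") == "true"
            return int(node.get("streampos", "0")), complete
    return 0, False


def save_state(meta_path, filename, streampos, is_complete):
    """Create or update the uploadinfo node for filename and rewrite the file."""
    if os.path.exists(meta_path):
        root = ET.parse(meta_path).getroot()
    else:
        # Assumption: "uploads" is a made-up root; the sample shows none.
        root = ET.Element("uploads")
    node = None
    for candidate in root.iter("uploadinfo"):
        if candidate.get("filename") == filename:
            node = candidate
            break
    if node is None:
        node = ET.SubElement(root, "uploadinfo", filename=filename)
    node.set("streampos", str(streampos))
    node.set("isUploadComplete", "true" if is_complete else "false")
    ET.ElementTree(root).write(meta_path, encoding="utf-8", xml_declaration=True)
```

Calling save_state after every successfully uploaded block keeps the recorded position at most one block behind the real one, so a resumed upload repeats at most one block.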
The entire upload process is completed in two stages:
- Uploading individual blocks, where each block is identified by a block ID (the PutBlock API in Fig 1)
- Committing all the uploaded blocks once the entire file is uploaded (the PutBlockList API in Fig 1)
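The two stages, together with resuming from a saved stream position, might look like the following Python sketch. It does not call any real storage SDK: put_block and put_block_list are hypothetical callables supplied by the caller that stand in for the PutBlock and PutBlockList operations, and block_id is a made-up helper. The sketch assumes the block size is unchanged between attempts, so that the saved position falls on a block boundary (or at the end of the file). Block IDs are base64 strings of equal length, which the Blob service requires within a single blob.

```python
import base64


def block_id(index):
    """Fixed-width, base64-encoded ID so every block ID has the same length."""
    return base64.b64encode(f"{index:08d}".encode("ascii")).decode("ascii")


def upload_in_blocks(stream, block_size, put_block, put_block_list,
                     start_pos=0, on_progress=None):
    """Stage blocks from start_pos onward, then commit the full block list."""
    stream.seek(start_pos)
    # Number of blocks already staged by an earlier attempt (rounded up so a
    # position at the end of the file counts the final, shorter block).
    index = -(-start_pos // block_size)
    while True:
        chunk = stream.read(block_size)
        if not chunk:
            break
        put_block(block_id(index), chunk)      # stage 1: upload one block
        index += 1
        if on_progress is not None:
            on_progress(stream.tell())         # e.g. persist streampos here
    ids = [block_id(i) for i in range(index)]
    put_block_list(ids)                        # stage 2: commit all blocks
    return ids
```

Passing a function that writes the position to uploadmetadata.xml as on_progress would tie this loop to the resume mechanism described earlier.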
A GUI-based tool has been developed. The tool requires the following input from the user:
- Account Name: Windows Azure storage account name
- Account Key: Windows Azure storage account key
- URI: Endpoint URI of the Blob storage service
- Container Name: Blob storage container to which the blob is uploaded. The tool creates the container if it does not exist.
- Input File: The name of the file to be uploaded.
- Block Size: The size of each individual block
When the user populates all the fields and clicks the Upload button, the upload process starts. The current status of the upload is shown in the status progress bar. Following is a snapshot of the tool:
Open Issues
- The application behaves incorrectly when an already uploaded blob is uploaded to a different container.
- The default timeout value for each upload request is 60 sec. If the user gets a “Server Time Out” error, they should choose a smaller block size.
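As a rough guide for the second issue, a block must finish uploading within the request timeout, so the usable block size is bounded by the upload throughput multiplied by the timeout. The Python sketch below illustrates that arithmetic; the function name max_block_size and the safety factor of 0.5 are assumptions made for illustration, not values taken from the tool.

```python
def max_block_size(throughput_bytes_per_sec, timeout_sec=60.0, safety=0.5):
    """Largest block size (bytes) expected to upload within the timeout.

    safety < 1 leaves headroom for latency and throughput fluctuations.
    """
    return int(throughput_bytes_per_sec * timeout_sec * safety)


# Example: a 64 KiB/s uplink with the default 60 sec timeout.
print(max_block_size(64 * 1024))  # 1966080 bytes, a little under 2 MiB
```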