Abstractions

Updated: October 2011

Author: http://msdn.microsoft.com/it-it/library/hh307529.aspx

Learn more about RBA Consulting.

Windows Azure Storage persists data entities in a distributed manner, which means that the storage is scalable and always available to consumers. To read and write data to the underlying storage system, you use abstractions of familiar data objects. For example, files and large streams of data are stored as blobs. In total, there are four object abstractions that satisfy most, if not all, of an application's requirements. The abstractions are blob, drive, table, and queue services.

Blobs

The blob service allows you to store any type of data, including strings and byte arrays. It comprises the following three resources:

  • Account - The account is the storage account for the subscription.

  • Container - Each storage account can have zero or more blob containers. Each container is made up of properties and metadata. You can also choose to create a root container for your blob service. The benefit of a root container is that you do not need to provide the root container name in the URI. Your account can only have one root container and it must be named $root.

  • Blob(s) - Each container can contain zero or more blob entities. The blob contains properties and metadata, as well as the content itself.

Each of these resources can be addressed by a unique URI that uses the following pattern:

http://<account>.blob.core.windows.net/[<container>/[<blob>]]

For example, if your storage account name is mystorageaccount123, and the account has a container named container1, which has a blob called blob999, then the resources would have the following URIs:

  • http://mystorageaccount123.blob.core.windows.net - This URI accesses the account.

  • http://mystorageaccount123.blob.core.windows.net/container1 - This URI accesses the container.

  • http://mystorageaccount123.blob.core.windows.net/container1/blob999 - This URI accesses the blob.
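
The URI pattern above can be sketched as a small helper. This is a minimal illustration in plain C#, not part of the Windows Azure SDK; the names BlobUriExample and BuildUri are hypothetical. It assumes that a blob addressed without a container name is served from the root container, as described earlier.

```csharp
using System;

public static class BlobUriExample
{
    // Builds a URI that follows the pattern
    // http://<account>.blob.core.windows.net/[<container>/[<blob>]]
    // When container is omitted but blob is given, the blob is assumed
    // to live in the root container, whose name does not appear in the URI.
    public static string BuildUri(string account, string container = null, string blob = null)
    {
        if (string.IsNullOrEmpty(account))
        {
            throw new ArgumentException("An account name is required.", "account");
        }

        string uri = "http://" + account + ".blob.core.windows.net";
        if (!string.IsNullOrEmpty(container))
        {
            uri += "/" + container;
        }
        if (!string.IsNullOrEmpty(blob))
        {
            uri += "/" + blob;
        }
        return uri;
    }

    public static void Main()
    {
        Console.WriteLine(BuildUri("mystorageaccount123"));
        Console.WriteLine(BuildUri("mystorageaccount123", "container1"));
        Console.WriteLine(BuildUri("mystorageaccount123", "container1", "blob999"));
    }
}
```

Running Main prints the three example URIs listed above, one per line.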

When you store a blob, you must choose either a block blob or a page blob. The two types let you optimize storage for the kind of entity that you are storing. For example, you can optimize a streaming video file by storing it as a block blob, which stores the video in chunks. A file that requires ranged reads and writes, such as a rolling log file, should be stored as a page blob.

Block blobs are made up of one or more blocks. Each block is identified by a block ID and can be up to 4 megabytes (MB) in size. Writing to a block blob takes two steps: you first upload the blocks, and then you commit them. You can store up to 50,000 blocks in a single blob, which is approximately 200 GB. Block blobs are well-suited to streaming data because you can deliver the blocks over time.
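
The block arithmetic can be checked with a short sketch. The code below is plain C# and does not call the storage service; it only plans how many blocks a payload needs and assigns each one an ID. The ID scheme (a fixed-width counter, Base64-encoded) and the names BlockPlanExample and PlanBlockIds are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class BlockPlanExample
{
    public const int MaxBlockSizeBytes = 4 * 1024 * 1024; // 4 MB per block (from the text)
    public const int MaxBlocksPerBlob = 50000;            // block limit (from the text)

    // Returns one block ID per block needed to hold totalBytes.
    // In the two-step write, each block would be uploaded under its ID,
    // and the ordered ID list would then be committed.
    public static List<string> PlanBlockIds(long totalBytes)
    {
        if (totalBytes < 0)
        {
            throw new ArgumentOutOfRangeException("totalBytes");
        }

        // Ceiling division: a partial final block still counts as a block.
        long blockCount = (totalBytes + MaxBlockSizeBytes - 1) / MaxBlockSizeBytes;
        if (blockCount > MaxBlocksPerBlob)
        {
            throw new ArgumentException("The data needs more than 50,000 blocks.");
        }

        var ids = new List<string>();
        for (int i = 0; i < blockCount; i++)
        {
            // Hypothetical ID scheme: six-digit counter, Base64-encoded,
            // so that every ID has the same length.
            string counter = i.ToString("D6", CultureInfo.InvariantCulture);
            ids.Add(Convert.ToBase64String(Encoding.UTF8.GetBytes(counter)));
        }
        return ids;
    }

    public static void Main()
    {
        // 10 MB of data needs 3 blocks: 4 MB + 4 MB + 2 MB.
        Console.WriteLine(PlanBlockIds(10L * 1024 * 1024).Count);

        // 50,000 blocks of 4 MB is 209,715,200,000 bytes, roughly 200 GB.
        long maxBytes = (long)MaxBlocksPerBlob * MaxBlockSizeBytes;
        Console.WriteLine(maxBytes);
    }
}
```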

Page blobs allow you to store more data than block blobs. The data is stored in pages, which you can access with random read and write operations. Each page blob can be up to 1 terabyte in size. All pages must align at 512-byte boundaries. Unlike block blobs, write operations against page blobs are committed immediately. You must specify the maximum size of the page blob when you create it. After the page blob is created, you can write to it by specifying the offset and a range of bytes. Page blobs are useful when you need fast and random access to a file.
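
The 512-byte alignment rule amounts to simple integer arithmetic. The following sketch, in plain C# with hypothetical helper names, shows one way to round a size up to a whole number of pages and to find the page that contains a given offset. It does not call the storage service.

```csharp
using System;

public static class PageAlignmentExample
{
    public const long PageSize = 512; // page boundary in bytes (from the text)

    // True when the value falls exactly on a page boundary.
    public static bool IsAligned(long value)
    {
        return value % PageSize == 0;
    }

    // Rounds a byte count up to the next multiple of the page size,
    // for example when choosing the maximum size of a new page blob.
    public static long RoundUpToPage(long bytes)
    {
        if (bytes < 0)
        {
            throw new ArgumentOutOfRangeException("bytes");
        }
        return (bytes + PageSize - 1) / PageSize * PageSize;
    }

    // Rounds an offset down to the start of the page that contains it.
    public static long PageStart(long offset)
    {
        if (offset < 0)
        {
            throw new ArgumentOutOfRangeException("offset");
        }
        return offset / PageSize * PageSize;
    }

    public static void Main()
    {
        Console.WriteLine(RoundUpToPage(1000)); // 1024
        Console.WriteLine(PageStart(1000));     // 512
        Console.WriteLine(IsAligned(1536));     // True
    }
}
```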

Windows Azure also allows you to create snapshots of your block or page blobs. A snapshot of a blob is a read-only copy of the base blob. You can read, copy, or delete snapshots, but you cannot modify them. You can use blob snapshots in various ways. For example, a snapshot can act as a backup of a blob: you can modify the base blob after the snapshot is taken, and later roll it back to the snapshot. Snapshots can accrue charges for your account if the base blob is updated, so it is recommended that you maintain only one snapshot of each blob to avoid unintentional charges.

Because each resource is addressable, Windows Azure provides an access control list (ACL) to limit or prevent access to the blobs and containers. ACLs are applied at the container level. By default, each container and its blobs can only be accessed by the owner of the storage account, or access can be delegated using a Shared Access Signature. However, you can grant anonymous users the ability to read the blob content and container metadata. The following table lists the different ACL options for blobs and their containers.

 

ACL                               | Enumerate blobs in a container    | Read blob data
No public read access (default)   | Account owner only                | Account owner only
Public read access for blobs only | Account owner only                | Account owner and anonymous users
Full public read access           | Account owner and anonymous users | Account owner and anonymous users
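
The three ACL options can also be read as two yes/no questions about anonymous users. The sketch below encodes them in plain C#. The enum and method names are hypothetical and are not the SDK's own types; the account owner can always enumerate and read, so only anonymous access is modeled.

```csharp
using System;

// Hypothetical names for the three container ACL options described above.
public enum ContainerAcl
{
    NoPublicReadAccess,   // default
    PublicReadBlobsOnly,
    FullPublicReadAccess
}

public static class ContainerAclExample
{
    // Anonymous users can enumerate blobs only with full public read access.
    public static bool AnonymousCanEnumerateBlobs(ContainerAcl acl)
    {
        return acl == ContainerAcl.FullPublicReadAccess;
    }

    // Anonymous users can read blob data unless public access is turned off.
    public static bool AnonymousCanReadBlobData(ContainerAcl acl)
    {
        return acl != ContainerAcl.NoPublicReadAccess;
    }

    public static void Main()
    {
        foreach (ContainerAcl acl in Enum.GetValues(typeof(ContainerAcl)))
        {
            Console.WriteLine("{0}: enumerate={1}, read={2}",
                acl, AnonymousCanEnumerateBlobs(acl), AnonymousCanReadBlobData(acl));
        }
    }
}
```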

Instead of using a five-part host name such as mystorageaccount123.blob.core.windows.net, you can provide your own domain name. You may want to do this to simplify the address. For example, a long resource URI such as http://mystorageaccount123.blob.core.windows.net/container1/blob999 might be replaced with http://sub.mydomain.com/container1/blob999, if you own the mydomain.com domain. For more information on how to register your custom domain with Windows Azure, see "How to Register a Custom Subdomain Name for Accessing Blobs in Windows Azure" at http://msdn.microsoft.com/en-us/library/ee795179.aspx.

Drives

A Windows Azure drive emulates a hard drive and provides the server on which it is mounted with a local, durable NTFS volume. Code that runs in your role can use any NTFS API to access the drive. The drive uses a page blob to read and write the data, which allows the data from flushed writes to persist regardless of the health or status of the role. Windows Azure drives can range from 16 MB to 1 terabyte in size. Your role can dynamically mount up to 16 drives at run time. To ensure the best performance, mount the drive in the same region as the hosted service, so that all I/O transactions occur within the same data center, which lowers latency.

A Windows Azure drive is essentially a hard drive in the cloud, and it is useful for third-party applications that must be long-lived and that store data and application state on a drive. For example, a third-party application such as MySQL is a good candidate for a drive.

You can also mount a Windows Azure drive for use as a local cache to improve the performance of your application. To configure your role to mount a drive at design time, modify the service definition file (ServiceDefinition.csdef) of your application to define a new LocalStorage element. The following code is an example of how to modify the file.

<ServiceDefinition xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="MvcWebRole1" >
    ...
    <LocalResources>
      <LocalStorage name="MyCacheResource" sizeInMB="10000"/>
    </LocalResources>
  </WebRole>
</ServiceDefinition>

This configures a new resource named MyCacheResource with approximately 10 GB (10,000 MB) of storage. You may need to review your role's intended VM size to determine how much local disk space is available. The following table lists the VM sizes that Windows Azure offers, and the amount of disk space for each size.

 

VM Size    | Disk Space for Local Storage Resources
ExtraSmall | 20 GB
Small      | 225 GB
Medium     | 490 GB
Large      | 1000 GB
ExtraLarge | 2040 GB
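
Before settling on a sizeInMB value, it can help to check the requested local storage against the disk space for the chosen VM size. The following is a rough sketch in plain C#. It hard-codes the figures from the table, assumes 1 GB equals 1,024 MB, and uses hypothetical names; the platform may reserve some of this space, so treat the result as an upper bound.

```csharp
using System;
using System.Collections.Generic;

public static class LocalStorageBudgetExample
{
    // Disk space for local storage resources per VM size, in GB (from the table).
    private static readonly Dictionary<string, int> DiskSpaceGb = new Dictionary<string, int>
    {
        { "ExtraSmall", 20 },
        { "Small", 225 },
        { "Medium", 490 },
        { "Large", 1000 },
        { "ExtraLarge", 2040 }
    };

    // Returns true when the sum of the requested LocalStorage sizes (in MB)
    // does not exceed the disk space listed for the VM size.
    public static bool Fits(string vmSize, params int[] requestedSizesInMb)
    {
        if (!DiskSpaceGb.ContainsKey(vmSize))
        {
            throw new ArgumentException("Unknown VM size: " + vmSize, "vmSize");
        }

        long totalMb = 0;
        foreach (int size in requestedSizesInMb)
        {
            totalMb += size;
        }
        return totalMb <= (long)DiskSpaceGb[vmSize] * 1024; // assumes 1 GB = 1,024 MB
    }

    public static void Main()
    {
        Console.WriteLine(Fits("ExtraSmall", 10000));        // True
        Console.WriteLine(Fits("ExtraSmall", 10000, 15000)); // False
        Console.WriteLine(Fits("Small", 10000, 15000));      // True
    }
}
```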

After you set the configuration file, modify your role's code to initialize the cache. The following code is an example of how to do this.

VB

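Imports Microsoft.WindowsAzure.ServiceRuntime
Imports Microsoft.WindowsAzure.StorageClient

' Use the local storage resource declared in ServiceDefinition.csdef as the drive cache.
Dim cache As LocalResource
cache = RoleEnvironment.GetLocalResource("MyCacheResource")
CloudDrive.InitializeCache(cache.RootPath, cache.MaximumSizeInMegabytes)

C#

using Microsoft.WindowsAzure.ServiceRuntime;
using Microsoft.WindowsAzure.StorageClient;

LocalResource cache = RoleEnvironment.GetLocalResource("MyCacheResource");
CloudDrive.InitializeCache(cache.RootPath, cache.MaximumSizeInMegabytes);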

Tables

Tables in Windows Azure allow you to store structured data in the form of entities. Unlike a traditional relational database table, Windows Azure tables do not have a fixed schema that defines a static set of columns and data types. Instead, each entity is a collection of properties that are stored as key-value pairs. Therefore, it is possible for two entities in the same table to have different sets of properties. It is even possible for two entities in the same table to have a property with the same name but a different data type. The following code is an example. The AddressEntity1 class has a property named CountryCode that is a string. The AddressEntity2 class has a property named CountryCode that is an integer.

VB

Imports System 
Public Class AddressEntity1

    '  Gets or sets the entity's PartitionKey value
    Public PartitionKey As String
 
    ' Gets or sets the entity's RowKey value
    Public RowKey As String
 
    ' Gets or sets the entity's Timestamp value
    Public Timestamp As DateTime

    ' Gets or sets the country code as a string value such as "US" or "CAN"
    Public CountryCode As String
 
    ' Gets or sets a tags value for the address
    Public Tags As String
 
End Class

Public Class AddressEntity2

    '  Gets or sets the entity's PartitionKey value
    Public PartitionKey As String
 
    ' Gets or sets the entity's RowKey value
    Public RowKey As String
 
    ' Gets or sets the entity's Timestamp value
    Public Timestamp As DateTime

    ' Gets or sets the country code as an integer value such as 1 for the United States
    Public CountryCode As Integer
 
    ' Gets or sets AddressEntity2 specific tags
    Public MyOwnTags As String
 
End Class

C#

using System;

public class AddressEntity1
{
    /// <summary>
    /// Gets or sets the entity's PartitionKey value
    /// </summary>    
    public string PartitionKey { get; set; }

    /// <summary>
    /// Gets or sets the entity's RowKey value
    /// </summary>    
    public string RowKey { get; set; }

    /// <summary>
    /// Gets or sets the entity's Timestamp value
    /// </summary>    
    public DateTime Timestamp { get; set; }

    /// <summary>
    /// Gets or sets the country code as a string value such as "US" or "CAN"
    /// </summary>
    public string CountryCode { get; set; }

    /// <summary>
    /// Gets or sets a tags value for the address
    /// </summary>
    public string Tags { get; set; }
}

public class AddressEntity2
{
    /// <summary>
    /// Gets or sets the entity's PartitionKey value
    /// </summary>    
    public string PartitionKey { get; set; }

    /// <summary>
    /// Gets or sets the entity's RowKey value
    /// </summary>    
    public string RowKey { get; set; }

    /// <summary>
    /// Gets or sets the entity's Timestamp value
    /// </summary>    
    public DateTime Timestamp { get; set; }

    /// <summary>
    /// Gets or sets the country code as an integer value such as 1 for the United States
    /// </summary>
    public int CountryCode { get; set; }

    /// <summary>
    /// Gets or sets AddressEntity2 specific tags
    /// </summary>
    public string MyOwnTags { get; set; }
}

The code creates the following table.

 

PartitionKey | RowKey           | Timestamp           | CountryCode | Tags           | MyOwnTags
"Address 1"  | "AddressEntity1" | 2011-05-07T00:00:00 | "US"        | "tag 1, tag 2" |
"Address 2"  | "AddressEntity2" | 2011-05-07T00:00:01 | 1           |                | "tag 3"

As you can see, the CountryCode column holds two values with different data types. The Tags and MyOwnTags columns each show a value only for the entity that defines that property; the other entity has no such property.

It can be confusing to view entities that have different schemas in the same table. It may be clearer to think of entities as collections of properties that are stored as key-value pairs.
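
The key-value view can be made concrete with ordinary dictionaries. The sketch below is plain C# and does not use the table service. It models the two example entities as property bags, with hypothetical class and method names, to show that the same property name can carry different types and that each entity has its own set of properties.

```csharp
using System;
using System.Collections.Generic;

public static class EntityPropertyBagExample
{
    // The first example entity: CountryCode is a string, and it has Tags.
    public static Dictionary<string, object> CreateAddress1()
    {
        return new Dictionary<string, object>
        {
            { "PartitionKey", "Address 1" },
            { "RowKey", "AddressEntity1" },
            { "Timestamp", new DateTime(2011, 5, 7, 0, 0, 0) },
            { "CountryCode", "US" },
            { "Tags", "tag 1, tag 2" }
        };
    }

    // The second example entity: CountryCode is an integer, and it has MyOwnTags.
    public static Dictionary<string, object> CreateAddress2()
    {
        return new Dictionary<string, object>
        {
            { "PartitionKey", "Address 2" },
            { "RowKey", "AddressEntity2" },
            { "Timestamp", new DateTime(2011, 5, 7, 0, 0, 1) },
            { "CountryCode", 1 },
            { "MyOwnTags", "tag 3" }
        };
    }

    public static void Main()
    {
        var entities = new[] { CreateAddress1(), CreateAddress2() };
        foreach (Dictionary<string, object> entity in entities)
        {
            Console.WriteLine("{0}: CountryCode is {1}, has Tags: {2}",
                entity["RowKey"],
                entity["CountryCode"].GetType().Name,
                entity.ContainsKey("Tags"));
        }
    }
}
```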

Entities have some constraints. An entity cannot be larger than 1 MB, nor can it have more than 255 properties. Each entity must include three system properties: PartitionKey, RowKey, and Timestamp. These properties are important because they provide some of the information that allows the table to be scalable, ordered, and traceable.

The PartitionKey property is a string value that represents the partition to which the entity belongs. Entities with the same PartitionKey are grouped in the same partition, and they will always be served from the same storage node.

The RowKey property is also a string value. It uniquely identifies entities that are within the same partition. The Timestamp property is a read-only DateTime property that reflects the last time the entity was updated.

The combination of the PartitionKey and RowKey properties forms the single index into the table, and uniquely identifies each entity in the table. The index also defines the sort order of the entities. Entities are first sorted by the PartitionKey property, and then by the RowKey property. The properties are sorted in ascending order. Because both properties are string values, the lexical comparison that occurs during the sort may result in an unwanted sort order. For example, even if your PartitionKey properties are the integer values 1, 2, 9, and 10, they will be sorted as the strings "1", "2", "9" and "10". The following table shows the resulting sort order.

 

PartitionKey | RowKey | Timestamp
"1"          | -      | -
"10"         | -      | -
"2"          | -      | -
"9"          | -      | -

The RowKey and Timestamp values are omitted from the table for clarity. To correct the sort order, pad the values with "0" so that they all have the same fixed length. For example, use "0001", "0002", "0009", and "0010".
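
The effect of padding can be reproduced locally. The following sketch is plain C# with hypothetical names. It uses an ordinal string comparison as a stand-in for the lexical ordering described above, which is an assumption for illustration, and it does not query a table.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class KeyPaddingExample
{
    // Formats a non-negative integer as a zero-padded, fixed-width key.
    public static string ToPaddedKey(int value, int width)
    {
        if (value < 0)
        {
            throw new ArgumentOutOfRangeException("value");
        }
        return value.ToString("D" + width, CultureInfo.InvariantCulture);
    }

    // Returns a new list with the keys sorted as strings, character by character.
    public static List<string> SortAsStrings(IEnumerable<string> keys)
    {
        var sorted = new List<string>(keys);
        sorted.Sort(StringComparer.Ordinal);
        return sorted;
    }

    public static void Main()
    {
        int[] values = { 1, 2, 9, 10 };

        var raw = new List<string>();
        var padded = new List<string>();
        foreach (int value in values)
        {
            raw.Add(value.ToString(CultureInfo.InvariantCulture));
            padded.Add(ToPaddedKey(value, 4));
        }

        Console.WriteLine(string.Join(", ", SortAsStrings(raw)));    // 1, 10, 2, 9
        Console.WriteLine(string.Join(", ", SortAsStrings(padded))); // 0001, 0002, 0009, 0010
    }
}
```

The width must be chosen up front to cover the largest key you expect, because changing it later would change the keys of entities that are already stored.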

Queues

A queue is used for durable messaging within Windows Azure. A durable message is one that is written to a stable store during processing to ensure persistence even during a computer failure or restart. Queues allow one part of your application to enqueue messages that can later be dequeued by another part of the application. The storage service makes a best effort to dequeue messages in first-in-first-out (FIFO) order. However, FIFO is not guaranteed because of various factors such as durability and scalability. Each message in a queue can be up to 8 KB in size. If you need to store something larger, the recommended action is to store the large message in a blob, and then reference the blob within a queued message. There is no limit on how many messages you can store in a single queue, except for the overall limit of 100 terabytes for the entire storage account.
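
The pattern of storing a large payload in a blob and queuing only a reference to it can be sketched as follows. This is an in-memory illustration in plain C#: the dictionary and queue are stand-ins for a blob container and a storage queue, and the "inline:" and "blob:" prefixes and all names are hypothetical conventions. The size check compares UTF-8 byte counts against the 8 KB figure from the text and ignores any encoding overhead that the real service may add.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class LargeMessageExample
{
    public const int MaxMessageBytes = 8 * 1024; // 8 KB per message (from the text)
    private const string InlinePrefix = "inline:";
    private const string BlobPrefix = "blob:";

    // In-memory stand-ins for a blob container and a queue (not the storage API).
    public static readonly Dictionary<string, byte[]> Blobs = new Dictionary<string, byte[]>();
    public static readonly Queue<string> Messages = new Queue<string>();

    // Enqueues the payload directly when it fits in one message. Otherwise
    // the payload is stored as a blob and only the blob name is enqueued.
    public static void Enqueue(string payload)
    {
        string inline = InlinePrefix + payload;
        if (Encoding.UTF8.GetByteCount(inline) <= MaxMessageBytes)
        {
            Messages.Enqueue(inline);
            return;
        }

        string blobName = Guid.NewGuid().ToString("N");
        Blobs[blobName] = Encoding.UTF8.GetBytes(payload);
        Messages.Enqueue(BlobPrefix + blobName);
    }

    // Dequeues the next message and resolves a blob reference if present.
    public static string Dequeue()
    {
        string message = Messages.Dequeue();
        if (message.StartsWith(InlinePrefix, StringComparison.Ordinal))
        {
            return message.Substring(InlinePrefix.Length);
        }

        string blobName = message.Substring(BlobPrefix.Length);
        return Encoding.UTF8.GetString(Blobs[blobName]);
    }

    public static void Main()
    {
        Enqueue("small payload");
        Enqueue(new string('x', 10000)); // larger than 8 KB, so it goes to a blob

        Console.WriteLine(Dequeue());        // small payload
        Console.WriteLine(Dequeue().Length); // 10000
        Console.WriteLine(Blobs.Count);      // 1
    }
}
```

A real consumer would also need to delete the blob once the message has been processed, which this sketch leaves out.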

If you have worked with traditional queues, you may be familiar with transactional reads: if the consumer fails, the read is never committed and the message remains in the queue. Windows Azure queues provide this type of durability by manipulating the visibility of the message. When you read a message from a queue, the message is immediately marked as invisible for a period of time. The default is thirty seconds. The consumer should process the message while it is invisible. If the processing is successful, the consumer should delete the message from the queue. However, if processing fails for whatever reason, the message is never deleted. When the invisibility period passes, the message is again marked as visible.

Because a message that becomes visible again can be read more than once, the order in which messages are read is not guaranteed to be the same as the order in which they went into the queue. In other words, when you read from a queue, do not assume that the messages arrive in FIFO order.

In addition, you should ensure that the processing of the messages is idempotent. Idempotency allows your application to process the same message multiple times without changing the intended outcome of processing that message.
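
The interaction between invisibility, redelivery, and idempotent processing can be simulated without the storage service. The sketch below is plain C# with a logical clock measured in seconds. SimulatedQueue, Message, and Handle are hypothetical names, and the in-memory set of handled IDs stands in for whatever durable record a real application would keep.

```csharp
using System;
using System.Collections.Generic;

public class Message
{
    public string Id;
    public string Body;
}

// In-memory stand-in for a queue with visibility timeouts.
public class SimulatedQueue
{
    private readonly List<Message> messages = new List<Message>();
    private readonly Dictionary<string, int> visibleAt = new Dictionary<string, int>();
    private int nextId;

    public void Enqueue(string body)
    {
        var message = new Message { Id = "m" + nextId, Body = body };
        nextId++;
        messages.Add(message);
        visibleAt[message.Id] = 0;
    }

    // Returns the first visible message and hides it for invisibleSeconds,
    // or returns null when no message is visible at the given time.
    public Message Get(int now, int invisibleSeconds = 30)
    {
        foreach (Message message in messages)
        {
            if (visibleAt[message.Id] <= now)
            {
                visibleAt[message.Id] = now + invisibleSeconds;
                return message;
            }
        }
        return null;
    }

    public void Delete(Message message)
    {
        messages.Remove(message);
        visibleAt.Remove(message.Id);
    }
}

public static class VisibilityExample
{
    // Applies a message's effect at most once per message ID.
    public static void Handle(Message message, HashSet<string> handledIds, List<string> log)
    {
        if (handledIds.Add(message.Id))
        {
            log.Add("applied " + message.Body);
        }
        else
        {
            log.Add("skipped " + message.Body);
        }
    }

    public static List<string> Run()
    {
        var queue = new SimulatedQueue();
        queue.Enqueue("order-1");
        queue.Enqueue("order-2");
        var handledIds = new HashSet<string>();
        var log = new List<string>();

        Message first = queue.Get(0);    // order-1, hidden until t = 30
        Handle(first, handledIds, log);  // effect applied, but the consumer
                                         // "crashes" before deleting the message

        Message second = queue.Get(1);   // order-2, because order-1 is still invisible
        Handle(second, handledIds, log);
        queue.Delete(second);

        Message retry = queue.Get(31);   // order-1 is visible again
        Handle(retry, handledIds, log);  // recognized as already handled
        queue.Delete(retry);

        return log;
    }

    public static void Main()
    {
        foreach (string entry in Run())
        {
            Console.WriteLine(entry);
        }
    }
}
```

In this run, order-1 is delivered a second time after order-2 has been read, and the second delivery is skipped rather than applied again.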

© 2014 Microsoft. All rights reserved.