Network Working Group D. Brasher Internet-Draft Interlinux LTD Intended status: Informational July 26, 2010 Expires: January 27, 2011 Long Term Archive Storage Protocol draft-brasher-ltasp-03 Abstract Long-term archiving storage fundamentally begins with archive data Accumulation, then Geo-duplication and then Management. Using A->G->M, LTASP has been created to solve mid-range and below, long- term archiving requirements of the small-medium enterprise. Where tape has been deployed in the past, LTASP now offers an alternative solution designed to be more robust and manageable in the long term than network attached storage devices or simple disk storage alone. Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on January 27, 2011. Copyright Notice Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of Brasher Expires January 27, 2011 [Page 1] Internet-Draft Long Term Archive Storage Protocol July 2010 the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 3 2.1. Disk storage structure . . . . . . . . . . . . . . . . . . 3 2.1.1. LTASP storage units . . . . . . . . . . . . . . . . . 3 2.1.2. Replication . . . . . . . . . . . . . . . . . . . . . 4 2.1.3. Node roles . . . . . . . . . . . . . . . . . . . . . . 4 2.1.4. Archive date range . . . . . . . . . . . . . . . . . . 4 2.1.5. Storage structure tables . . . . . . . . . . . . . . . 4 2.2. Data transfer mechanisms . . . . . . . . . . . . . . . . . 6 2.2.1. Lowest maximum bandwidth (LMB) . . . . . . . . . . . . 7 2.2.2. Data transfer timing . . . . . . . . . . . . . . . . . 8 2.2.3. Phases . . . . . . . . . . . . . . . . . . . . . . . . 8 2.2.4. Data flow, table . . . . . . . . . . . . . . . . . . . 8 2.2.5. Hyper virtual autochanger (HVA) . . . . . . . . . . . 10 2.2.6. Fill mechanism . . . . . . . . . . . . . . . . . . . . 11 3. Security considerations . . . . . . . . . . . . . . . . . . . 11 3.1. Passwords . . . . . . . . . . . . . . . . . . . . . . . . 11 3.2. User space . . . . . . . . . . . . . . . . . . . . . . . . 11 3.3. Application layer . . . . . . . . . . . . . . . . . . . . 11 3.4. Checksum . . . . . . . . . . . . . . . . . . . . . . . . . 12 3.5. Virtual private network . . . . . . . . . . . . . . . . . 12 3.6. Encrypted partitions, logical volumes and volumes . . . . 12 4. Community project and UK trademarks . . . . . . . . . . . . . 12 5. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . 12 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 12 7. Change log . . . . . . . . . . . . . . . . . . . . . . . . . . 12 8. Informative references . . . . . . . . . . . . . . . . . . . . 13 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 13 Brasher Expires January 27, 2011 [Page 2] Internet-Draft Long Term Archive Storage Protocol July 2010 1. Introduction The architecture of LTASP has been designed to; maximise archive storage capacity, availability, restoration and recovery speed, scalability, modularity in code and storage resilience; to minimise operating resource overheads, impact of network outages and management overheads; to simplify the code development cycle, deployment, data recovery and integration with existing systems. LTASP architecture consists of two main parts. The disk storage structure and the data transfer mechanisms. The data transfer mechanisms in turn consists of four algorithms. An algorithm responsible for collecting data and three that are collectively known as the HVA (hyper virtual auto-changer). Data transfer calculations are based on LMB (lowest maximum bandwidth) between storage nodes. An LTASP node can be located on any computer, probably a server. Nodes are named and allocated roles which are abstract and can be changed. LTASP is designed to operate on three distinct operating system nodes and at the same time allowing more than one instance of LTASP to exist on each node. LTASP I/O operation is asynchronous (non-blocking) but the architecture design co-ordinates and records transfer activity. 2. Architecture 2.1. Disk storage structure 2.1.1. LTASP storage units LTASP storage units are roughly equivalent to the volume created on tape. Most backup software products generate something equivalent to the tape volume on disk. These disk volumes are organised to minimise the duplication of data to conserve space into full, differential and incremental types. For constant data streams, for example CCTV footage, the volumes are all effectively type full. LTASP is designed to store full and differential types. The disk storage structure is designed to store a full once a month and its corresponding differentials or a full followed by x number of months of matching differential volumes. Some control is available when choosing a name for newly generated volumes. If the type of data is streamed and cannot be differentiated against a full then the option to not collect full volumes is available. Then all volumes are full but archived in the same way as differentials. Brasher Expires January 27, 2011 [Page 3] Internet-Draft Long Term Archive Storage Protocol July 2010 2.1.2. Replication The disk structure ensures volumes are replicated across three operating systems which can be located within IP sight of each other over LAN, WLAN or WAN. This provides geographical resilience. The structure on the disk is organised into slots which can be directories. 2.1.3. Node roles The nodes are are allocated one of three roles; A, B or C. The roles a machines allocated can be changed in future. This allows machines to be re-located. For each node year slots are created, then month slots created and day slots including the special slots ad0 and aFull$$, where $ is an integer between 0 and 2. Full01 will store a Full volume at the beginning of the month and skip d1. Full02 is there for additional redundancy and to cope with the scenario where the current month is the last (this is not default behaviour). The special slot d0 has more than one purpose and is described in more detail in the section describing the data transfer mechanisms. 2.1.4. Archive date range The start year and end year can be chosen according to storage requirements. Retrospective archiving can be achieved by archiving in the past by creating a LTASP disk structure from any chosen date in the past. Retrospective archiving allows the user to transfer archives from old tapes into LTASP. The number of years of archiving required in the future is selectable. 2.1.5. Storage structure tables These diagrams show the disk storage structure. The root of the structure is a n*years then filled with months then days including the special slots. Brasher Expires January 27, 2011 [Page 4] Internet-Draft Long Term Archive Storage Protocol July 2010 Slots for each year, extendable and pre 2009 is retrospective +-------+--------+--------+--------+ | Slots | node A | node B | node C | +-------+--------+--------+--------+ | | ... | ... | ... | | | 2006 | 2006 | 2006 | | | 2007 | 2007 | 2007 | | | 2008 | 2008 | 2008 | | | 2009 | 2009 | 2009 | | | 2010 | 2010 | 2010 | | | 2011 | 2011 | 2011 | | | ... | ... | ... | +-------+--------+--------+--------+ Table 1: Year slots Slots for each month located in each year +-------+--------+--------+--------+ | Slots | node A | node B | node C | +-------+--------+--------+--------+ | | mth1 | mth1 | mth1 | | | mth2 | mth2 | mth2 | | | mth3 | mth3 | mth3 | | | mth4 | mth4 | mth4 | | | mth5 | mth5 | mth5 | | | mth6 | mth6 | mth6 | | | mth7 | mth7 | mth7 | | | mth8 | mth8 | mth8 | | | mth9 | mth9 | mth9 | | | mth10 | mth10 | mth10 | | | mth11 | mth11 | mth11 | | | mth12 | mth12 | mth12 | +-------+--------+--------+--------+ Table 2: Month slots Brasher Expires January 27, 2011 [Page 5] Internet-Draft Long Term Archive Storage Protocol July 2010 Slots in each day located in each month including special slots +--------+--------+--------+--------+ | Slots | node A | node B | node C | +--------+--------+--------+--------+ | (Dirs) | Full01 | Full01 | Full01 | | | Full02 | Full02 | Full02 | | | d0 | d0 | | | | d01 | d01 | d01 | | | d02 | d02 | d02 | | | d03 | d03 | d03 | | | d04 | d04 | d04 | | | d05 | d05 | d05 | | | d06 | d06 | d06 | | | d07 | d07 | d07 | | | d08 | d08 | d08 | | | d09 | d09 | d09 | | | d10 | d10 | d10 | | | d11 | d11 | d11 | | | d12 | d12 | d12 | | | d13 | d13 | d13 | | | d14 | d14 | d14 | | | d15 | d15 | d15 | | | d16 | d16 | d16 | | | d17 | d17 | d17 | | | d18 | d18 | d18 | | | d19 | d19 | d19 | | | d20 | d20 | d20 | | | d21 | d21 | d21 | | | d22 | d22 | d22 | | | d23 | d23 | d23 | | | d24 | d24 | d24 | | | d25 | d25 | d25 | | | d26 | d26 | d26 | | | d27 | d27 | d27 | | | d28 | d28 | d28 | | | d29 | d29 | d29 | | | d30 | d30 | d30 | | | d31 | d31 | d31 | +--------+--------+--------+--------+ Table 3: Day of month slots 2.2. Data transfer mechanisms Brasher Expires January 27, 2011 [Page 6] Internet-Draft Long Term Archive Storage Protocol July 2010 2.2.1. Lowest maximum bandwidth (LMB) [Recalibrate this section to adjust for the most recent developments] LBM = Lowest Maximum Bandwidth between any three nodes NB: actual max transfer will vary so test transfers are recommended for accuracy. LBM assumes all available bandwidth is allocated to running LTASP. Max Full01 = LMB x 6 hrs. This assumes no transfer interruptions and that the maximum bandwidth is constant. Example calculations for a month of archiving using a single full volume and a corresponding daily differentials. Ave. diff = (Sum 29 (or a month) daily differentials) / 29. Average differential is variable depending on your storage growth, this represents a trend and can be an estimate to start with, but by watching the trend of differential growth more accurate calculations can be made. It is assumed your differentials are always smaller in size than the initial full copy. Min LTASP slot size (node a) = (Max Full01 x 2) + (29 x ave. diff) + (1 x ave diff) plus 1 x ave diff to account for d0. Min LTASP slot size (node b/c) = (Max Full01 x 2) + (29 x ave diff). You can include transfer log files in the Min LTASP slot size, for simplicity they have been omitted. Example system +----------------------+------------+------------------+------------+ | Example system | LMB x 6 | Ave diff | Max | | | hrs | | aFull01 | +----------------------+------------+------------------+------------+ | LBM occurs between | 1Mbit/Sec | Estimated 500 | 2.6 GiB | | b->c | | MiB | | +----------------------+------------+------------------+------------+ Table 4: Example system Min LTASP slot size (node a) (2.6 x 2) + (29 x 0.5) + 0.5 = 20.2 GiB Min LTASP slot size (node b-c) (2.6 x 2) + (29 x 0.5) = 19.7 GiB If a copy fails then the system will retry the next day but you loose the day of failure. Logging can be used to trace successful copies. Brasher Expires January 27, 2011 [Page 7] Internet-Draft Long Term Archive Storage Protocol July 2010 2.2.2. Data transfer timing [Maybe re-write this section and incorporate into phases, or simplify and keep both sections] Two entry points, Full01 beginning of month and d0 for the remaining days. Assuming entry points are filled during the day before the cycle begins at night. Scheduled jobs split between 3 nodes, d0 is cleared after copy to d$. The system reduces single point of failure by creating a single full copy on each node at the beginning of the month then at the end of the month to cover the next 30 day diap cycle. The copies between a-a and a-b occur in the first three hours then the copy from b-c happens after three hours. These times are changed as required. Nightly copies to and between nodes are made to new slots, if due to some fail conditions a node is unavailable then the copy is not made, however the next day when communication is restored copies continue to the next nightly directory. This increases robustness over previous layout as the next nightly copy is not dependent on the success of the previous night's copy. Only two copies between nodes are made between days 3-30, day 1 a single full and day 2 full and day 30 does make an internal copy on all nodes. 2.2.3. Phases 2.2.4. Data flow, table Flow of data (* for all in column) +------------+----------------+----------------+----------------+ | Day - Time | A | B | | | D1-T=0 | Full01-> | Full01 | | | D2-T=0 | | | | | D2-T=0 | d0->d1 *(a->a) | | | | D2-T=0 | d0->d1 *(a->b) | | | | D2-T=3 | | d1->d1 | | | D3-T=0 | d0->d2 | | | | D3-T=0 | d0->d2 | | | | D3-T=3 | | d2->d2 | | | D4-T=0 | d0->d3 | | | | D4-T=0 | d0->d3 | | | | D4-T=3 | | d3->d3 | | | D5-T=0 | d0->d4 | | | | D5-T=0 | d0->d4 | | | | D5-T=3 | | d4->d4 | | | D6-T=0 | d0->d5 | | | | D6-T=0 | d0->d5 | | | | D6-T=3 | | d5->d5 | | | D7-T=0 | d0->d6 | | | Brasher Expires January 27, 2011 [Page 8] Internet-Draft Long Term Archive Storage Protocol July 2010 | D7-T=0 | d0->d6 | | | | D7-T=3 | | d6->d6 | | | D8-T=0 | d0->d7 | | | | D8-T=0 | d0->d7 | | | | D8-T=3 | | d7->d7 | | | D9-T=0 | d0->d8 | | | | D9-T=0 | d0->d8 | | | | D9-T=3 | | d8->d8 | | | D10-T=0 | d0->d9 | | | | D10-T=0 | d0->d9 | | | | D10-T=3 | | d9->d9 | | | D11-T=0 | d0->d10 | | | | D11-T=0 | d0->d10 | | | | D11-T=3 | | d10-d10 | | | D12-T=0 | d0->d11 | | | | D12-T=0 | d0->d11 | | | | D12-T=3 | | d11->d11 | | | D13-T=0 | d0->d12 | | | | D13-T=0 | d0->d12 | | | | D13-T=3 | | d12->d12 | | | D14-T=0 | d0->d13 | | | | D14-T=0 | d0->d13 | | | | D14-T=3 | | d13->d13 | | | D15-T=0 | d0->d14 | | | | D15-T=0 | d0->d14 | | | | D15-T=3 | | d14->d14 | | | D16-T=0 | d0->d15 | | | | D16-T=0 | d0->d15 | | | | D16-T=3 | | d15->d15 | | | D17-T=0 | d0->d16 | | | | D17-T=0 | d0->d16 | | | | D17-T=3 | | d16->d16 | | | D18-T=0 | d0->d17 | | | | D18-T=0 | d0->d17 | | | | D18-T=3 | | d17->d17 | | | D19-T=0 | d0->d18 | | | | D19-T=0 | d0->d18 | | | | D19-T=3 | | d18->d18 | | | D20-T=0 | d0->d19 | | | | D20-T=0 | d0->d19 | | | | D20-T=3 | | d19->d19 | | | D21-T=0 | d0->d20 | | | | D21-T=0 | d0->d20 | | | | D21-T=3 | | d20->d20 | | | D22-T=0 | d0->d21 | | | | D22-T=0 | d0->d21 | | | | D22-T=3 | | d21->d21 | | | D23-T=0 | d0->d22 | | | Brasher Expires January 27, 2011 [Page 9] Internet-Draft Long Term Archive Storage Protocol July 2010 | D23-T=0 | d0->d22 | | | | D23-T=3 | | d22->d22 | | | D24-T=0 | d0->d23 | | | | D24-T=0 | d0->d23 | | | | D24-T=3 | | d23->d23 | | | D25-T=0 | d0->d24 | | | | D25-T=0 | d0->d24 | | | | D25-T=3 | | d24->d24 | | | D26-T=0 | d0->d25 | | | | D26-T=0 | d0->d25 | | | | D26-T=3 | | d25->d25 | | | D27-T=0 | d0->d26 | | | | D27-T=0 | d0->d26 | | | | D27-T=3 | | d26->d26 | | | D28-T=0 | d0->d27 | | | | D28-T=0 | d0->d27 | | | | D28-T=3 | | d27->d27 | | | D29-T=0 | d0->d28 | | | | D29-T=0 | d0->d28 | | | | D29-T=3 | | d28->d28 | | | D30-T=0 | Full01->Full02 | | | | D30-T=0 | | Full01->Full02 | Node C | | D30-T=0 | | | Full01->Full02 | | D30-T=0 | d0->d29 | | | | D30-T=0 | d0->d29 | | | | D30-T=3 | | d29->d29 | | +------------+----------------+----------------+----------------+ Table 5: Data flow Start 00:00 - End 00:06 - T = 0(00:00) - T=3(03:00) All copies from a-a are not used in bandwidth calculations. 2.2.5. Hyper virtual autochanger (HVA) HVA is the term used to collectively describe the three algorithms that work together but operate independently on each node to ensure the data transfers occur as describe in the previous two sections. This term is derived from the term virtual auto changer. A virtual auto changer still requires hardware tape drives, 'Hyper' takes this one stage further by emulating the virtual auto changer in software. 2.2.5.1. Copy types 2.2.5.2. Leap years Brasher Expires January 27, 2011 [Page 10] Internet-Draft Long Term Archive Storage Protocol July 2010 2.2.6. Fill mechanism The filling mechanism works as follows: The start time is an integer between 0 and 11. The fill is triggered by a scheduling application like cron. Then a check is made to see if the previous days copy was successfully made. If not then an alert is made and logged for later use. If yes then a search, using a pre-defined string, is made in a directory containing the backup volumes. If full volumes have been selected for collection then a check for the day of month is made. [Currently in implementation this is day 2 - this will be settable to any day]. The name of the full volume is a pre-defined string. If a full needs to be transferred then the most recent full volume is located. A check is made to see if the full has been collected before is made. If no then the full is copied to the appropriate slot and a date and sha1sum is created and located in the slot with the volume. [see section; Security considerations, checksum for more detail]. The activity is logged and the algorithm ends. If yes then the activity is logged and the algorithm stops. If a full volume is not required to be transferred then the most recent differential volume is located using a pre-defined string. The contents of d0 are cleared. A check is made to verify the differential has not been collected before. If yes then the activity is logged and the algorithm ends. If not the the latest differential is copied to the appropriate slot and a date and sha1sum created and located in the slot with the volume. Activity is logged and the algorithm ends. 3. Security considerations In implementation it is recommended these security precautions are followed. 3.1. Passwords Do not store any passwords on file. Passwords should be stored in memory temporarily. When a password is requested the entry view is hidden. New account passwords are quality checked and a warning given if not secure. 3.2. User space Implementation to operate in user space to reduce risk of system compromise and the effect of system compromises. 3.3. Application layer Handle network communications with OpenSSH. [ref] Generate unique RSA or better certificates. Pass-wordless certificates should be re- Brasher Expires January 27, 2011 [Page 11] Internet-Draft Long Term Archive Storage Protocol July 2010 generated often. Rsync uses OpenSSH to transfer data. Use different port to the standard SSH port 22 and individually set these for each node. add references to [RFC4251] [RFC3174] 3.4. Checksum Use sha1 checksum [ref] and date stamp volumes as they enter LTASP. This information can be used to validate the integrity of stored archives in the future. add references to [RFC4251] [RFC3174] 3.5. Virtual private network Use a virtual private network between nodes for an additional layer of security. 3.6. Encrypted partitions, logical volumes and volumes Use encrypted partitions or logical volumes to enhance physical security. Use encrypted archive volumes. The encryption may be applied by backup software initial responsible for generating the volumes 4. Community project and UK trademarks A community software implementation resides at DIASER (R) [1] A UK trademark exists for DIASER to protect the acronym for Open Source community development. 5. Conclusion To be written. 6. Acknowledgements Thanks are due to my wife Marisa and Myles McClelland and a number of individuals from various groups. Also Stephen Pelc of MPE Forth [2] for SME deployment context and advice and IPR consultancy. JISC [3] for providing technical development funding through OMII-UK [4] and ECS [5] (University of Southampton) in collaboration with Interlinux Ltd. [6] 7. Change log 26 Jul 2010 - tla updated. Brasher Expires January 27, 2011 [Page 12] Internet-Draft Long Term Archive Storage Protocol July 2010 23 Feb 2010 - Diffs from draft-brasher-diap-11.txt incorporated. 22 Feb 2010 - Updated abstract. 22 Feb 2010 - Converted from DIAP. 8. Informative references [DIASER] Brasher, D., "Distributed Internet Archiving for Educational Repositories (DIASER)", April 2009, . [DIAP] Brasher, D., "Distributed Internet Archive Protocol (DIAP)", Nov 2009, . [DIASER manual] Brasher, D., "DIASER manual", July 2010, . [1] [2] [3] [4] [5] [6] Author's Address Damian Brasher Interlinux LTD PO Box 1623 Southampton, Hampshire SO15 9AE United Kingdom Email: dbrasher@interlinux.co.uk Brasher Expires January 27, 2011 [Page 13]