Abstract
Conventional encrypted deduplication approaches retain the deduplication capability on duplicate chunks after encryption by always deriving the key for encryption/decryption from the chunk content, but such a deterministic nature causes information leakage due to frequency analysis. We present
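The leakage described above can be illustrated with a minimal sketch. It assumes the content-derived key is simply the SHA-256 hash of the chunk, and it uses a toy XOR keystream in place of a real cipher; the names `derive_key`, `toy_encrypt`, and `encrypt_chunks` are hypothetical and are not taken from the paper.

```python
import hashlib
from collections import Counter


def derive_key(chunk: bytes) -> bytes:
    """Derive the key from the chunk content (assumed here: SHA-256 of the chunk)."""
    return hashlib.sha256(chunk).digest()


def toy_encrypt(chunk: bytes, key: bytes) -> bytes:
    """XOR the chunk with a SHA-256 counter-mode keystream.

    Illustrative stand-in for a real cipher; not suitable for production use.
    Applying it twice with the same key returns the original chunk.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(chunk):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # zip stops at the end of the chunk, so extra keystream bytes are ignored.
    return bytes(c ^ s for c, s in zip(chunk, stream))


def encrypt_chunks(chunks):
    """Encrypt every chunk under its own content-derived key."""
    return [toy_encrypt(c, derive_key(c)) for c in chunks]


# Hypothetical workload: one chunk appears three times, two appear once.
chunks = [b"header", b"payload-A", b"header", b"header", b"payload-B"]
ciphertexts = encrypt_chunks(chunks)

# Duplicate plaintext chunks map to duplicate ciphertexts, so deduplication
# still works, but the frequency distribution is preserved for an observer.
plain_freq = sorted(Counter(chunks).values(), reverse=True)
cipher_freq = sorted(Counter(ciphertexts).values(), reverse=True)
```

Because encryption is deterministic, `plain_freq` and `cipher_freq` are identical, which is the property a frequency-analysis adversary exploits.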
Tunable Encrypted Deduplication with Attack-resilient Key Management