Practice
Resources
Contests
Online IDE
New
Free Mock
Events New Scaler
Practice
Improve your coding skills with our resources
Contests
Compete in popular contests with top coders
logo
Events
Attend free live masterclass hosted by top tech professionals
New
Scaler
Explore Offerings by SCALER

Hashing

Last Updated: Nov 17, 2023
Go to Problems
locked
Hashing
info
Complete all the problems in this Topic to unlock a badge
Completed
Go to Problems

Level 1

Jump to Level 2

Level 2

Jump to Level 3

Level 3

Jump to Level 4

Level 4

Jump to Level 5

Level 5

Jump to Level 6

Level 6

Jump to Level 7

Level 7

Jump to Level 8

Level 8

Be a Code Ninja!
Contents

Hashing Summary

What is the need for hashing

  • Consider a scenario where you want to design a system for storing records of employees of your organization keyed using their employee IDs. We need this system primarily to do the following 3 tasks:

    • Insert a employee ID along with corresponding employee data.
    • Search based on employee ID and fetch that employee’s details.
    • Delete a employee data based on the employee ID entered.

  • What are the data structures that we can think of for performing the above functionalities for our system efficiently? Well, let’s analyze about implementing this system using the known data structures:

    • Arrays / Linked Lists of employee ID and data:
      • If we implement the search functionality in linear fashion, it can be costly to us when we have huge number of records to be searched upon.
      • If we make sure that the arrays are maintained in sorted manner based on the employee ID, then using Binary Search, we can search for employee ID O(Logn) time. But this will have an impact on insert and delete functionalities because we need to maintain sorted order of arrays upon every insert and deleting of records. 
      • Hence this data structure is not feasible to use.
    • Balanced binary search tree (BBST) taking employee ID as keys:
      • A BBST is a binary tree in which the left and right subtrees of every node always differs in height by no more than 1.

      • Here, due to the nature of BBST data structure, we can guarantee that the search, insert and delete functionalities can take place in O(Logn) time. 

      • But, is there any more efficient way? Let’s look for other options.

    • Direct Access Table (DAT):
      • This data structure uses the O(1) retrieval time complexity property of Arrays to its advantage. Lets see how below.
      • In DAT, we make a very large array and use employee ID as index for the array.
      • We treat a value in array as null if employee ID is not present, else the value stores pointer to data mapping to the employee ID. 
      • Here, we can do all the operations - Search, Insert and Delete in O(1) time which is the best possible we can achieve.
      • But, this has many limitations.
        • The obvious major problem is we need to have a large space dedicated for this data structure. The space complexity would be huge.
        • Another problem is if we have employee ID purely as an integer of n digits, then a programming language cannot support to store its value as integer as the value increases (as the number of records increase after a particular limit).

 

Applications

Hashing has found applications in lots of fields due to its property of irreversibility and constant time access. Following are the applications of hashing:

  • Password Security and Verification
    • Due to property of “one-way” or “irreversible” hash function, the world has seen a rise in cryptographic hash functions which provides the hash value in encrypted format.
    • Security: Whenever we are using any website for the first time which requires our authentication using “Sign Up” wherein we enter our credentials to access the accounts that belongs to us, we need to be assured that our account does not fall into wrong hands. Hence, the password entered is stored in the database in the form of a hash. This is done to avoid our crucial data falling into wrong hands. 
    • Verification: When we try to login to the website from next time onwards, the password hash is computed and sent to the server for verifying whether that password belongs to us. The hash value is sent so that whenever a data is sent from client to server there is no case of sniffing. 
  • Tokenization: The same above concept explained in “Password Security and Verification” can be extended to tokenization for storing the authorization tokens of users already logged into the systems.
  • Data Structures: Hashing is commonly used in defining and developing data structures of programming languages. For example: HashMap in Java, unordered_map in C++ etc.
  • Compiler Development The compilers corresponding to programming languages save all the keywords (such as if, else, for, while, return etc) that are used by that language in the form of hash table so that the compiler can easily identify what keywords and identified are needed to be compiled during compilation process.
  • Message Digests: This is again an application of cryptographic hashes.
    • Consider a scenario where you use cloud services to store your files and data. Now you need to be assured that the third party cloud service providers dont tamper with your files. We do so by computing hash of that file/data using a cryptographic hashing algorithms. We store these computed hashes in our local machines for reference.

    • Next time whenever we download the files from cloud storages, we can directly compute the hash of the files. We match it with the locally stored hash that was previously computed.

    • If the file has been tampered by the service providers, the hash value of the file would definitely change. If it hasnt been tampered then the hash values of the downloaded file and the locally saved value would be exactly same. 

    • Most common cryptographic algorithm is SHA256. These are also called as digest algorithms. The files in the above example is considered to be a message.

  • Others:
    • Hashing is also used in various common algorithms like Rabin Karp algorithm, counting frequency of characters in a word, creating git commit codes by git tool, caching etc.
    • Apart from above mentioned applications, hashing is also commonly used in Blockchain, Machine learning for feature hashing, storing objects or files in AWS S3 Buckets, Indexing on Database for faster retrieval and many others!

FAQs

  • Are Hashing algorithms secure?

    • We have a separate branch of hashing algorithms that provide security and they are called as Secure Hashing Algorithms (SHAs). These are also called as cryptographic hashing algorithms which are widely used in computer and cloud systems for providing security to any resource.
    • The most secure hashing algorithm available is SHA256.
  • Who invented Hashing?

    • Hashing was first invented by Hans Peter Luhn in the 1940s.
    • He demonstrated his work on hashing at a six-day international conference devoted to scientific information which was scheduled in November 1958.
  • How does Hashing work?

    • Hashing mainly works by mapping data of any arbitrary size to fixed-size value with the help of a function called “hash function” and storing it in a data structure called “hash table” at the value returned by hash functions. The values returned by this function are called hash codes, digests, hash values or hashes.
    • Read more
  • How is hashing implemented in Java?

    • Java has provided inbuilt classes for the purpose of implementing hashing and they are as follows:
      • HashTable<Object, Object> : This is a synchronised (very useful in multi-threaded environments) implementation of hashing.
        • Implementation:
          • Code:
            import java.util.*; 
            
            class InterviewBit { 
               public static void main(String args[]) 
               { 
            
                   // Defining HashTable to store  
                   // String values mapped to integer keys 
                   Hashtable<Integer, String> 
                       hashTable = new Hashtable<Integer, String>(); 
            
                   // Fill the values in HashTable
                   hashTable.put(1, "InterviewBit"); 
                   hashTable.put(2, "Example"); 
                   hashTable.put(5, "for"); 
                   hashTable.put(3, "HashTable"); 
            
                   // Display the hashtable 
                   System.out.println(hashTable); 
               } 
            } 
            
            
          • Output: {1=InterviewBit, 2=Example, 5=for, 3=HashTable}
      • HashMap<Object, Object> : This is a non - synchronised (doesnt work in multi-threaded environments) and faster implementation of hashing. Here, the order of insertion of keys are not maintained.
        • Implementation:
          • Code:
            import java.util.*; 
            
            class InterviewBit { 
               public static void main(String args[]) 
               { 
            
                   // Defining HashMap to store  
                   // String values mapped to integer keys 
                   HashMap<Integer, String> 
                       hashMap = new HashMap<Integer, String>(); 
            
                   // Fill the values in HashMap
                   hashMap.put(1, "InterviewBit"); 
                   hashMap.put(2, "Example"); 
                   hashMap.put(5, "for"); 
                   hashMap.put(3, "HashMap"); 
            
                   // Display the hashMap 
                   System.out.println(hashMap); 
               } 
            } 
            
            
          • Output: {1=InterviewBit, 5=for, 3=HashMap, 2=Example}
      • LinkedHashMap<Object, Object>: Works similar to HashMap but here the order of insertion of keys are maintained.
      • ConcurrentHashMap<Object, Object>: This class works similar to HashTable class. This is synchronized, meaning it will work great with multi-threaded environments and this class is faster than HashTable as it supports efficient means of handling multiple locks on the resource.
      • HashSet<Object>: This works similar to HashMap class but here only the keys are maintained. Doesn’t support duplicate objects. Here, Order of insertion of keys is not maintained.
      • LinkedHashSet<Object> This is similar to LinkedHashMap. The class just supports the keys. Order of insertion of keys are maintained.
  • Why the results of hashing cannot be reverted back to original input?

    • Hashing was developed by keeping in mind of the security of any systems. Hence, for a good hash function, it is always important for the function to be “one-way” so that the hackers do not get easy access to the critical data.
  • When is it not advisable to use Hashing/ Hash Tables?

    • In general, hashing has excellent time complexity for Insert/ Search / Delete operations.
    • For smaller application, it is advisable to use arrays because hash table consumes more memory space.
    • Hash tables do not support some operations like iterating over the entries where the keys are within a defined range, finding the entry with the largest/smallest key and so on. In such cases, it is better to go with arrays.
  • Where is Hashing used?

    • Hashing is most widely used across many applications such as password security, password verification, tokenization, data structures and compilers of programming languages, blockchain, machine learning feature hashing and many more!

Video Courses
By

View All Courses
Excel at your interview with Masterclasses Know More
Certificate included
What will you Learn?
Free Mock Assessment
Fill up the details for personalised experience.
Phone Number *
OTP will be sent to this number for verification
+33 *
+33
Change Number
Graduation Year *
Graduation Year *
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
2027
2028
2029
*Enter the expected year of graduation if you're student
Current Employer
Company Name
College you graduated from
College/University Name
Job Title
Job Title
Engineering Leadership
Software Development Engineer (Backend)
Software Development Engineer (Frontend)
Software Development Engineer (Full Stack)
Data Scientist
Android Engineer
iOS Engineer
Devops Engineer
Support Engineer
Research Engineer
Engineering Intern
QA Engineer
Co-founder
SDET
Product Manager
Product Designer
Backend Architect
Program Manager
Release Engineer
Security Leadership
Database Administrator
Data Analyst
Data Engineer
Non Coder
Other
Please verify your phone number
Edit
Resend OTP
By clicking on Start Test, I agree to be contacted by Scaler in the future.
Already have an account? Log in
Free Mock Assessment
Instructions from Interviewbit
Start Test