# Neural Networks can do anything, even Decryptable Hash Functions. But how?

Neural networks are a hot topic these days, and for all the right reasons: whenever we don’t know the logic to solve a problem, we train a model, and the results are impressive... Neural nets are being used like Jhandu Balm, applied wherever we want.. 😆

I have seen many people talk about neural nets without a thorough knowledge of the math that makes them so cool.. I am one of them: my initial projects worked, but I could not understand how the results were that good.. Just reading blogs with titles like “implement your first neural network in under a minute” is enough to get some small projects working, but to innovate with this awesome tech, we need clarity.

“They just work, I don’t know how” is a common phrase among students these days (and I am one of them too 😆). But today, let us understand that “HOW”.

I started a project out of the same ignorance, and I want to share my realizations.

Since neural nets were surprisingly performant, I confidently began a project to implement decryptable hash functions.. Yes, only when we don’t have enough knowledge can we think this boldly. I started working on this project, and I want to take you through my journey.

**Introduction:** Hash functions are mathematical functions that convert inputs into compressed representations. The input to a hash function can be of any arbitrary length, and yet its output is always of a fixed length. (That’s the beauty!)

Hash functions have the following special features that make them important:

1) Hiding the length of the text: every text, once hashed, gets converted to a fixed-length string. We hence cannot infer the size of the text that has been hashed..!

2) Avalanching: a small change in the input has a large, amplified effect on the output

3) It is computationally infeasible to recover the original input from its hash
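The first two properties are easy to see with a real hash function. Here is a quick sketch using Python’s standard `hashlib` (SHA-256 is just one example of such a function):

```python
import hashlib

# Property 1: inputs of wildly different lengths...
short_text = "hi"
long_text = "hi" * 1000

# ...both hash to the same fixed length (SHA-256 -> 64 hex characters)
h_short = hashlib.sha256(short_text.encode()).hexdigest()
h_long = hashlib.sha256(long_text.encode()).hexdigest()
print(len(h_short), len(h_long))  # 64 64

# Property 2 (avalanching): changing one character scrambles the whole output
h1 = hashlib.sha256(b"hello world").hexdigest()
h2 = hashlib.sha256(b"hello worle").hexdigest()
print(h1)
print(h2)
```

The two final hashes share almost nothing, even though the inputs differ by a single character.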

Hash functions are functionally so good that they are used for digital signatures, message authentication codes, password tables, key derivation, globally unique ID generation and management, and many other applications.

With so many impactful features, it is obvious that hash functions are used in varied domains. But they cannot be used to encrypt text for communication, as hashed text cannot be decrypted back. It is a one-way path; the reverse is blocked.

*Why can’t we decrypt a hashed text?*

The hash of a string is always the same, but multiple strings can have the same hash.. Hence, it’s not possible to reverse the function to get back the original string (because, while decrypting, we cannot tell which text was hashed to produce this code).
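To make the ambiguity concrete, here is a deliberately terrible toy hash (not a real algorithm, just an illustration): two different strings land on the same code, so no decryptor could ever tell them apart.

```python
def toy_hash(s: str) -> int:
    """A deliberately weak 'hash': sum of character codes, modulo 100."""
    return sum(ord(c) for c in s) % 100

# Two different strings, one hash code:
print(toy_hash("ab"))  # 95
print(toy_hash("ba"))  # 95

# Given only the code 95, which input produced it? We cannot tell.
```

Real hash functions like SHA-256 have the same pigeonhole problem, just with astronomically many inputs per output.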

**How, then??**

Hence the only way to have a decryptable hash function is to design a function that generates a unique hash code for each text. Now that is the problem…!!

For example, suppose the text is 100 characters long and the hash code is 50 characters long. For a language with 26 different characters, we can form 26¹⁰⁰ different 100-character texts, while the hash codes can only represent 26⁵⁰ different combinations.

We just can’t distribute 26⁵⁰ codes among 26¹⁰⁰ texts with each text getting a unique code. So a hash function cannot be a one-to-one relation (by the pigeonhole principle, on average 26⁵⁰ different texts must share each single code).
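The counting argument above is just the pigeonhole principle, and Python’s arbitrary-precision integers let us state it directly:

```python
texts = 26 ** 100  # every possible 100-character text
codes = 26 ** 50   # every possible 50-character hash code

# There are vastly more texts than codes...
print(texts > codes)  # True

# ...so on average this many distinct texts must share each code:
print(texts // codes == 26 ** 50)  # True
```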

So, mathematically, it seems to be impossible.

**Solution:** Now, it is obvious that we cannot design such a “mathematical” hash function. But what if we could represent these 26¹⁰⁰ combinations in a 50-character string where each character represents 2 characters in itself (or even more, as per requirement)? Sounds stupid, right? But seriously, this could be a way of designing a decryptable hash..

Isn’t this what **auto-encoders** do? They take a sequence of numbers and convert it into a smaller sequence (a latent space representation), from which the original can be retrieved back…! But how do they work?

**How are auto-encoders able to do this?**

In simpler terms, auto-encoders take sequences of numbers and convert them into smaller sequences. Data is obviously lost in this situation too, but what’s better here is that a sequence of *integers* is converted into a smaller sequence of *real numbers*, and thus the data loss is far smaller than if we converted the sequence of integers into a smaller sequence of integers again.

Auto-encoders are not rule-based programs; they depend on **predicting** the original sequence from the latent representation. Mind it, they *PREDICT*, not *CONVERT*. **Whenever prediction is involved, probability comes up, and thus authenticity is not assured.** But despite this, auto-encoders perform well for us.

The mathematical reason is that each unit vector in the reduced space actually carries more information than a unit vector in the original space.. Confusing, right? Auto-encoders decrease the dimensionality of vectors, but at the same time, each dimension in the reduced space represents an overlap of multiple dimensions of the original space.

If we view a sequence of integers as a vector and encode it using an auto-encoder, its latent space representation is a smaller vector (sequence) of real numbers. So even when the length of the sequence is reduced, the data loss is compensated to some extent.
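As a sketch of this idea, here is the simplest possible “auto-encoder”: a linear one. A classical result says the optimal linear auto-encoder coincides with PCA, so we can skip training and compute it directly with an SVD. The data, dimensions, and seed below are all made up for illustration; the point is that 10 numbers per sample compress to 5 real numbers with no loss, because the data has enough hidden structure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 length-10 "sequences" that secretly live in a 5-dim subspace.
basis = rng.normal(size=(5, 10))
X = rng.normal(size=(200, 5)) @ basis  # shape (200, 10)

# The optimal *linear* auto-encoder is equivalent to PCA:
# encode with the top-5 right singular vectors, decode with their transpose.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
W = Vt[:5]      # (5, 10) encoder matrix; W.T is the decoder

Z = X @ W.T     # latent codes: 10 numbers compressed to 5 reals each
X_hat = Z @ W   # reconstruction from the latent codes

print(Z.shape)                # (200, 5)
print(np.allclose(X, X_hat))  # True: nothing was lost
```

A trained non-linear auto-encoder does the same job for data whose structure is not a flat subspace; the linear case just makes the "compression without loss" claim easy to verify.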

Simplest example:

Let’s take a vector (a sequence of integers): 1, 2, 3, 4, 5, 6, 7, 8, 9, 0

Now we can reduce the vector’s dimension to 5 without losing any data. We can represent the sequence as: 1.2, 3.4, 5.6, 7.8, 9.0

The data wasn’t lost because each dimension now carries more information than before. And that’s what auto-encoders are good at..
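The digit-packing example above can be written out directly. This is plain arithmetic, not a learned model, but it shows the same trick an auto-encoder exploits: each latent number carries two of the original digits.

```python
seq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]

# Encode: pair up the digits, packing each pair "a, b" into one real "a.b"
latent = [a + b / 10 for a, b in zip(seq[0::2], seq[1::2])]
print(latent)  # [1.2, 3.4, 5.6, 7.8, 9.0]

# Decode: split each real number back into its two digits -- lossless here
decoded = []
for x in latent:
    decoded.append(int(x))
    decoded.append(round((x - int(x)) * 10))
print(decoded == seq)  # True
```

The `round` call absorbs floating-point noise when recovering the fractional digit.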

Hence, if we over-fit the auto-encoder, there is a fair chance that the latent space representations can be decoded back with 100% accuracy…!

Did we overpower math? Probably…

Haha, you’re convinced? Then you probably need to go learn the stuff properly. Go!