Two Plus Two Publishing LLC Two Plus Two Publishing LLC
 

Go Back   Two Plus Two Poker Forums > Other Topics > Science, Math, and Philosophy

Notices

Science, Math, and Philosophy Discussions regarding science, math, and/or philosophy.

Reply
 
Thread Tools Display Modes
Old 05-17-2012, 05:35 PM   #1
veteran
 
masque de Z's Avatar
 
Join Date: Aug 2009
Location: Stanford, CA USA
Posts: 3,331
Information content of long streams of numbers

Imagine i gave you a random 20 digit number say ;

43958610982765353257

and asked you to represent it in a more compact manner eg in the sense of say

30!+2^57=265252859812191202751496555855872

or 777654321^2 + 1=604746242969971042

etc you get the idea. I selected the number randomly so it may be very hard to simplify or entirely lucky and indeed see a dramatic reduction in length. If so it was not intentional.

Lets say we can use the basic symbols like +-*/()^! and numbers 0-9 and maybe an extra separator type character like | to signify the end of a stream representation with operations and the beginning of the next etc . That way we may play with 1 billion digits in packets of 20 -30 or whatever seems convenient at the time.

Imagine we have massive computational power and plenty of time and huge local memory (database capacity) but memory restrictions in the finished representation (or it could be transmission length restrictions).

So the objective is now this;


Start with a stream of very long seemingly random integers. It can be length 20 blocks or length 100 or whatever or it can even be variable depending how we find patterns. Our objective is to transform them from the numerical 0-9 digital form (or whatever binary effective later) to a functional representation form that can be re-encoded again using 0-9.

Will we be able to compact the long stream of integers to something smaller with the above procedure?

Obviously some numbers are dramatically easy like 10^100 or 2^800 etc lol

Additionally we may try sums of cubes or squares or products of powers of primes+- corrective terms etc

In any case if the process works we may end up with less memory needed to store the numbers (but unlike jpeg say compression we lose nothing, its easily reversible just do the math lol).


Take now the resulting number stream that is possibly significantly shorter than the original and try to do the same one more time!!! LOL

Imagine some crazy effort that after repeated such transformations we end up in something ultra small that of course takes time to reproduce to original form but with computers its very fast because in the final encoded message there are all the instructions needed to rebuild the prior level compression and so on until we reach the final form that is only numbers and no symbols or operations.


Is there some universal limit in this process for random numbers? Or am i missing something important here?


Can it be that introducing the extra symbols above eg +-*/()^!| creating in effect a ~19 base system (feel free to find another useful symbol i missed and take it to 20-base) what we save from the very good cases like 56^17 kind of representation (that takes only 5 symbols in base 19 or 20 to represent something that takes 29 digits in base 10 system) or 788! etc we lose back eventually because we switched from 10 to 20 base and the majority of numbers are simply too complex to decode to something simple?


So basically i ask what is the avg information content of a long random integer???

Is it possible with repeated application of this scheme to arrive at some massive spectacular compression if we have enough computing power to search for patterns and simplify long streams in the manner described?
masque de Z is offline   Reply With Quote
Old 05-17-2012, 05:50 PM   #2
Pooh-Bah
 
lastcardcharlie's Avatar
 
Join Date: Aug 2006
Location: you got it
Posts: 4,011
Re: Information content of long streams of numbers

http://en.wikipedia.org/wiki/Kolmogorov_complexity
lastcardcharlie is offline   Reply With Quote
Old 05-17-2012, 05:54 PM   #3
Pooh-Bah
 
jewbinson's Avatar
 
Join Date: Jul 2009
Posts: 4,251
Re: Information content of long streams of numbers

Quote:
Originally Posted by lastcardcharlie View Post
owned.
jewbinson is offline   Reply With Quote
Old 05-17-2012, 06:21 PM   #4
veteran
 
masque de Z's Avatar
 
Join Date: Aug 2009
Location: Stanford, CA USA
Posts: 3,331
Re: Information content of long streams of numbers

Quote:
Originally Posted by jewbinson View Post
owned.
I suggest you read the link (had read before and today again) (but i will go over further again in time) and what i posted and re-examine the concept of "owned". Of course i am somewhat familiar with Kolmogorov complexity but i am looking into practical theorems about what i asked not general terminology and little connection. So the link is of no help directly to answer what i asked. Maybe some theorem in it is however, so i will check more.

If it is then show me where in the link there are exact results that matter to the thread and then i ask lastcardcharlie why he didnt take the time to point the exact theorems or other references in the first place ???
masque de Z is offline   Reply With Quote
Old 05-17-2012, 06:32 PM   #5
Pooh-Bah
 
jewbinson's Avatar
 
Join Date: Jul 2009
Posts: 4,251
Re: Information content of long streams of numbers

Dude, chill. It was just coz he posted a 1-word answer that I wrote that. I was joking...
jewbinson is offline   Reply With Quote
Old 05-17-2012, 06:54 PM   #6
Pooh-Bah
 
lastcardcharlie's Avatar
 
Join Date: Aug 2006
Location: you got it
Posts: 4,011
Re: Information content of long streams of numbers

Quote:
Originally Posted by masque de Z View Post
Of course i am somewhat familiar with Kolmogorov complexity...
Why of course? Is it such an important topic in physics?
lastcardcharlie is offline   Reply With Quote
Old 05-17-2012, 07:00 PM   #7
adept
 
Chasqui's Avatar
 
Join Date: Sep 2009
Location: location ,location.
Posts: 1,086
Similar questions have been asked in the Usenet compression newsgroup from the beginning of time. Some "regulars" have tried to push their methods for decades!

AFAIK the bottom line is that for all numbers/files/messages there is no free lunch. Compression can only be achieved for certain strings.

In terms of a small set of big numbers. There are some well known open compression challenges with particular sets of numbers deemed "random" out there, some big names have tried many approaches. If a google search fails, try searching the compression newsgroups.

Last edited by Chasqui; 05-17-2012 at 07:11 PM.
Chasqui is offline   Reply With Quote
Old 05-17-2012, 07:45 PM   #8
veteran
 
masque de Z's Avatar
 
Join Date: Aug 2009
Location: Stanford, CA USA
Posts: 3,331
Re: Information content of long streams of numbers

Quote:
Originally Posted by lastcardcharlie View Post
Why of course? Is it such an important topic in physics?
It might develop into one. Most certainly black holes entropy Shannon entropy etc are not exactly unrelated issues.

Look here for instance to see some activity to be further investigated for fun;

http://inspirehep.net/search?ln=en&l...rm=&rg=25&sc=0

One is clearly benefited by keeping an open attitude in modern physics.
masque de Z is offline   Reply With Quote
Old 05-17-2012, 07:46 PM   #9
Cooler than Sammy Hagar
 
Join Date: Aug 2005
Location: Salt Lake City
Posts: 19,743
Re: Information content of long streams of numbers

Sounds like a data compression algorithm question. I believe it has been shown that data compression can never "reliably" compress data, I don't remember exactly how it works but it's something like if there is some string x1 that can be compressed by a data compression scheme then there is some y1 that will actually take up more data under that same scheme.

Or to put it another way, if you're using a string of 20 characters to store a number, then the total set of numbers you can store is only r20, where r is the range of possible characters.

Using 20 characters of the " +-/*^(){}0123456789" range is going to be able to store more individual values than using 20 characters of the "0123456789" range, but beyond that you can't "buy" any new "slots," with the same character range a sequence of n characters will hold rn possible states that can map to no more than rn referents.

It's just that n boxes can only hold n values, you have no wiggle room for that. On the other hand, in things like file compression and image compression the data tends to follow certain patterns so in practice you can gain a benefit by using a compression scheme that simplifies the most commonly found patterns. But in principle, because the number of boxes is finite, if you move a pattern from a "long" box into a "short" box, then you must also move a pattern from a "short" box into a "long" box.
madnak is offline   Reply With Quote
Old 05-17-2012, 08:03 PM   #10
old hand
 
dessin d'enfant's Avatar
 
Join Date: May 2012
Posts: 1,435
Re: Information content of long streams of numbers

Quote:
Originally Posted by masque de Z View Post
Will we be able to compact the long stream of integers to something smaller with the above procedure?
In the average case, no. Almost all strings do not have patterns that can be exploited to compress them. The proof is very similar in concept to showing that almost all real numbers are transcendental.
dessin d'enfant is offline   Reply With Quote
Old 05-17-2012, 08:32 PM   #11
centurion
 
Join Date: May 2012
Posts: 172
Re: Information content of long streams of numbers

Quote:
Originally Posted by madnak View Post
Sounds like a data compression algorithm question. I believe it has been shown that data compression can never "reliably" compress data, I don't remember exactly how it works but it's something like if there is some string x1 that can be compressed by a data compression scheme then there is some y1 that will actually take up more data under that same scheme.

Or to put it another way, if you're using a string of 20 characters to store a number, then the total set of numbers you can store is only r20, where r is the range of possible characters.

Using 20 characters of the " +-/*^(){}0123456789" range is going to be able to store more individual values than using 20 characters of the "0123456789" range, but beyond that you can't "buy" any new "slots," with the same character range a sequence of n characters will hold rn possible states that can map to no more than rn referents.

It's just that n boxes can only hold n values, you have no wiggle room for that. On the other hand, in things like file compression and image compression the data tends to follow certain patterns so in practice you can gain a benefit by using a compression scheme that simplifies the most commonly found patterns. But in principle, because the number of boxes is finite, if you move a pattern from a "long" box into a "short" box, then you must also move a pattern from a "short" box into a "long" box.
I had a feeling this was the case, but couldn't prove it. Proof seems obv to me now, ty for that
Clue is offline   Reply With Quote
Old 05-17-2012, 09:11 PM   #12
veteran
 
masque de Z's Avatar
 
Join Date: Aug 2009
Location: Stanford, CA USA
Posts: 3,331
Re: Information content of long streams of numbers

Quote:
Originally Posted by dessin d'enfant View Post
In the average case, no. Almost all strings do not have patterns that can be exploited to compress them. The proof is very similar in concept to showing that almost all real numbers are transcendental.
Can you link to such proof? I mean because although i intuitively see what you mean and was afraid for it as my original post hinted, the connection with polynomials and rational coefficients is not exactly identical (but suggestive anyway) to me to the "space" of all possible operations i proposed including powers, factorials, prime numbers sums of powers, products etc.

Last edited by masque de Z; 05-17-2012 at 09:16 PM.
masque de Z is offline   Reply With Quote
Old 05-17-2012, 10:40 PM   #13
old hand
 
dessin d'enfant's Avatar
 
Join Date: May 2012
Posts: 1,435
Re: Information content of long streams of numbers

Quote:
Originally Posted by masque de Z View Post
Can you link to such proof? I mean because although i intuitively see what you mean and was afraid for it as my original post hinted, the connection with polynomials and rational coefficients is not exactly identical (but suggestive anyway) to me to the "space" of all possible operations i proposed including powers, factorials, prime numbers sums of powers, products etc.
See this paper by Shannon.

For a friendly intro, see wiki article on Shannon Entropy



I think the first link I gave is behind a pay wall....free version should be here

Last edited by dessin d'enfant; 05-17-2012 at 10:56 PM.
dessin d'enfant is offline   Reply With Quote
Old 05-17-2012, 10:54 PM   #14
veteran
 
masque de Z's Avatar
 
Join Date: Aug 2009
Location: Stanford, CA USA
Posts: 3,331
Re: Information content of long streams of numbers

Quote:
Originally Posted by dessin d'enfant View Post
See this paper by Shannon.

For a friendly intro, see wiki article on Shannon Entropy
Is this the same 1948 paper?

http://www.alcatel-lucent.com/bstj/v...tj27-3-379.pdf
masque de Z is offline   Reply With Quote
Old 05-17-2012, 11:10 PM   #15
old hand
 
dessin d'enfant's Avatar
 
Join Date: May 2012
Posts: 1,435
Re: Information content of long streams of numbers

Quote:
Originally Posted by masque de Z View Post
Yeah sorry....It's Shannon's 1948 paper "A Mathematical Theory of Communication"....I didn't realize initially that it was behind a paywall. My edited link (as does the one you provided) gives a free version.
dessin d'enfant is offline   Reply With Quote

Reply
      

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are Off
Pingbacks are Off
Refbacks are Off



All times are GMT -4. The time now is 12:57 AM.


Powered by vBulletin®
Copyright ©2000 - 2013, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO 3.6.0 ©2011, Crawlability, Inc.
Copyright © 2008-2010, Two Plus Two Interactive