
The Gaming Den Welcome to the Gaming Den.

View previous topic :: View next topic 
Author 
Message 
Starmaker Duke
Joined: 07 Mar 2008 Posts: 2289 Location: Redmonton

Posted: Mon Oct 26, 2015 11:59 am Post subject: Programmer help needed 


2016's post here.
I'm going to be funemployed real soonish and currently searching for a new job. This is what they want a prospective employee to solve and optimize the solution:
Quote:  1. Given an array of points on a surface (defined by their Cartesian coordinates  "for the sake of simplicity, the coordinates are integer numbers"  this is verbatim), for each point, find the point closest to it and all points within double distance to the closest point. 
Quote:  2. Find the earliest appearance of a given sequence of digits in the sequence 1234567891011121314151617181920212223... 
I solved #2. It has fuck all to do with substring search.
What's a fast way to do #1? (The only improvement over the dumbest possible solution I can imagine is not using the square root.)
Last edited by Starmaker on Fri Sep 30, 2016 6:28 pm; edited 1 time in total 

Back to top 


RobbyPants Prince
Joined: 06 Aug 2008 Posts: 4276

Posted: Mon Oct 26, 2015 12:23 pm Post subject: 


Does #1 specify a number of points? If the number is large, there could be a way that involves sorting ahead of time to reduce search space later.
I had to do something kind of like #1 in my last job, but I don't know that it was the most optimal solution. I picked the solution I did because it involved analyzing a few hundred thousand points in 3D space, and you really only cared about any point relative to other points nearby.
So, the idea I picked was to take a bunch of time (and memory) up front sorting the data by it's 3D location, so I'd have a much smaller number of points to compare. I basically made a jagged 4D array, where the first indicies of the first 3 dimensions corresponded to what "cube" the point existed in, in 3D space, and the 4th dimension was a list of points in that "cube".
That being said,you might be able to do the same type of thing here with a 2D array. The index could list it's location, and you'd store a number of point in those coordinates. Alternately, it could store bools/bits just to say whether or not any points are at that location. Without the sort, it looks like you'd be doing a nested loop (O^2 complexity) to analyze the points, followed by another, shorter nested loop to sort your hits. If you were to to plop them all into a 2D array where each index corresponded to its actual position, for any given point you're checking, you would know which other points are within double distance, which would limit your search space. You would start at that "center" of that search space and look outward, breaking out of the search once you found a hit.
So, that would have O complexity for setting up the grid, and each search after that would be technically O^2, but for a much shorter list. So, this would be faster on a really long set of points, but would perform worse on a short list. If the problem doesn't specify, they probably are not looking for something like this.
Last edited by RobbyPants on Mon Oct 26, 2015 12:26 pm; edited 3 times in total 

Back to top 


DSMatticus Prince
Joined: 14 Apr 2011 Posts: 4827

Posted: Mon Oct 26, 2015 3:08 pm Post subject: 


The first is a space partitioning problem.
If you partition a large space, you can restrict your search only to the partitions which could possibly contain solutions. If I asked you if there were any pizza places within 25 miles of my house in Ohio, you would not look at pizza places in Texas. No point in Texas is within 25 miles of any point in Ohio. Texas  and the many many pizza places therein  are eliminated from consideration in one fell swoop simply by virtue of the fact that the partition is so far away it cannot possibly contain any solutions.
The space partitioning algorithm everyone comes up with on their own is the one RobbyPants describes, and the best way to visualize it is that you're grid papering your input. Once your input is divided into these neat, even, repeating cells, you need only consider the cells which might contain solutions; collectively those cells will contain some number of points, and you will perform the usual O(N^2) search on them.
The "robust" space partitioning algorithm is recursive partitioning. Divide the space in two. If there are still too many points in either half, divide that half in two. If there are still to many points in either half, divide that half in two. If there are still okay you know what recursion is I'm sure you've grasped the basic concept. Structurally, you arrange this like a binary (or quad, or oc) tree whose leaves are all of the final partitions (and lists of the points contained therein). To find partitions that might contain solutions, you just traverse the tree, remembering that if the parent cannot contain a solution none of its children can.
While the first type of algorithm will work in 99% of real world cases (and is used in 99.99% real world cases, even if it's one of the .99% in which it won't), do note that the second type of algorithm adapts to peculiarities of the input automatically (while the first one can just fail spectacularly as every input lands in the same cell). This is why it's the answer that gets you an A on exams and this is why it's the answer your prospective employer probably wants to see. So I suppose my advice is google quadtrees and get comfortable with them.
Last edited by DSMatticus on Mon Oct 26, 2015 3:11 pm; edited 1 time in total 

Back to top 


schpeelah KnightBaron
Joined: 08 Jun 2008 Posts: 504

Posted: Mon Oct 26, 2015 5:53 pm Post subject: 


Create a second array to hold point distances from the given point. Find the minimal point, saving the distances into the array, then get all points within double distance. Complexity: 2N comparisons and N calculations.
The partitioning would be OK if you could reuse it for multiple queries against the same array of points, at which point you need to know the acceptable tradeoff between setup for the first query and the time of each subsequent query.
No fancy tricks will help if the scenario is to only do this once. If you have a reasonable bound on max distance, you could make an array of stacks to sort points into for the second pass, but when your input is an array of pairs of integers in an arbitrary order nothing can save you from from calculating distance for each of them and comparing to find the minimum  the total complexity of everything else you do has to therefore beat simply reading all the distances again and comparing to be an optimization.
This problem might just be to find people who don't actually know the basics or try to overcomplicate things. My current job's interview programming task was IIRC to just implement insert sort in Java. Those things tend to be basic.
Last edited by schpeelah on Mon Oct 26, 2015 5:58 pm; edited 2 times in total 

Back to top 


RobbyPants Prince
Joined: 06 Aug 2008 Posts: 4276

Posted: Mon Oct 26, 2015 7:01 pm Post subject: 


schpeelah wrote:  Create a second array to hold point distances from the given point. Find the minimal point, saving the distances into the array, then get all points within double distance. Complexity: 2N comparisons and N calculations.  That doesn't make sense. Distance is relative from two points.
The problem explicitly says to calculate it per point.
Quote:  for each point, find the point closest to it and all points within double distance to the closest point. 
So, you're approach is 2N * N, where you're guaranteeing that each point must be compared to all others. Partitioning limits the search size. 

Back to top 


Ancient History Invincible Overlord
Joined: 18 Aug 2010 Posts: 10904

Posted: Mon Oct 26, 2015 7:23 pm Post subject: 


schpeelah is thinking in terms of vectors and vector addition, I think. _________________ The Unpublishable  Updates Fridays between midnight and midnight  http://wikithulhu.com 

Back to top 


schpeelah KnightBaron
Joined: 08 Jun 2008 Posts: 504

Posted: Mon Oct 26, 2015 8:10 pm Post subject: 


Ancient History wrote:  schpeelah is thinking in terms of vectors and vector addition, I think.  No, I just missed the part about it being for each point rather than just a single chosen point. 

Back to top 


Longes Duke
Joined: 04 Nov 2013 Posts: 2423


Back to top 


Starmaker Duke
Joined: 07 Mar 2008 Posts: 2289 Location: Redmonton

Posted: Thu Nov 05, 2015 11:37 am Post subject: 


edit:
So turns out the inperson test is a GROUP task. Well fuck me sideways.
Does anyone have any pointers to shitty advice for HR managers on how to conduct those? Like "watch interviewees carefully, whoever scratches his or her nose when talking was born when Mars was ascendant, hire them"?

Thank you, everyone!
I've been invited for an inperson interview. I have no idea how they conduct those  there'll be a test, but what it will entail and what sources will be available is anyone's guess. Google fails me; where it didn't fail me was to reveal they hire, like, 1.5 people per year. So my chances are not good.
But at least I understand recursion now.
(I'm never posting another Xzibit meme.)
Last edited by Starmaker on Thu Nov 05, 2015 12:17 pm; edited 1 time in total 

Back to top 


Lokey Journeyman
Joined: 13 Oct 2015 Posts: 105

Posted: Thu Nov 05, 2015 11:15 pm Post subject: 


They always vary, it's whatever the HR troll you land is looking for. Try company work philosophy webstuff especially stuff that's repeated a lot, section headers or figure annotations (probably in reverse order since HR trolls are lazy).
Do not do real work for free. Do not fling feces. Do not authoritatively answer questions you aren't sure about. After that, it's a crapshoot anywhere these days 

Back to top 


TarkisFlux Duke
Joined: 22 Jun 2008 Posts: 1142 Location: Magic Mountain, CA

Posted: Fri Nov 06, 2015 4:29 pm Post subject: 


I'm curious what your solution for #2 looked like Star. _________________ The wiki you should be linking to when you need a wiki link  http://www.dndwiki.org
Fectin: "Ant, what is best in life?"
Ant: "Ethically, a task wellcompleted for the good of the colony. Experientially, endorphins." 

Back to top 


Starmaker Duke
Joined: 07 Mar 2008 Posts: 2289 Location: Redmonton


Back to top 


Starmaker Duke
Joined: 07 Mar 2008 Posts: 2289 Location: Redmonton

Posted: Fri Sep 30, 2016 6:26 pm Post subject: 


New year and a new attempt! (Also, the chance of funemployment is now higher than ever: they forced me to take a fucking literacy test written by an illiterate teambuilding coach, which I failed.)
This is Stage 1, which only requires submitting a web form with numerical answers, no source code, no time constraints, and I already sent in my answers (kept clearing the cookie until it gave me an easy problem set). So this is for my personal edification only.
Quote:  Let f(n) be the smallest positive integer m such that m! (note the bang) is divisible by positive integer n (e.g. f(5) = 5, f(6) = 3, f(10) = 5).
Find the sum of f(n) for a large number of large consecutive integers (say, 990'000'000 to 1'000'000'000). 
What I did:
 factorized each number, e.g. 1'000'000'000 = 2^9 * 5^9
 for every prime factor p of every number, I found a multiple of said factor m(p) which together with its preceding multiples contains just enough total factors to make up p's power:
Code:  2 * 4 * 6 * 8 * 10 * 12 divides 2^9
1 + 2 + 1 + 3 + 1 + 2 = 10 (greater than 9)
5 * 10 * 15 * 20 * 25 * 30 * 35 * 40 divides 5^9
1 + 1 + 1 + 1 + 2 + 1 + 1 + 1 = 9 
Then the maximum of m(p) (in this case, the maximum of 12 and 40, which is 40) is the f(n) that we want. Sum them up.
However, factorization is still slooooooow, so the code runs for several hours. Brute force runs for a month. For comparison, the secondslowest problem takes 80 seconds.
Is there a faster way to do this? 

Back to top 


schpeelah KnightBaron
Joined: 08 Jun 2008 Posts: 504

Posted: Fri Sep 30, 2016 7:17 pm Post subject: 


How did you factorize? My first instinct would be a modified sieve of Eratosthenes which should be n^2, not all that slow everything considered.
In fact, let's ditch factorization. Compute consecutive m!, then add m to the answer for every multiple of m! not already marked in the giant bool array. 

Back to top 


Pixels Knight
Joined: 14 Jun 2010 Posts: 356

Posted: Fri Sep 30, 2016 8:08 pm Post subject: 


Yeah, you can attack from the factoral side with something like this:
Code:  public int calculate(int n)
{
long fact = 1;
for (int m = 2; m < n; m++)
{
fact = (fact * m) % n;
if (fact == 0)
return m;
}
return n;
}  However, it won't work well for primes or semiprimes. f(n)=n if n is prime, and f(n)=p if n=p*q for p, q prime with p>q. It's not too much effort to filter out primes with Fermat though, and your input numbers are small enough that it might not matter anyhow. 

Back to top 


schpeelah KnightBaron
Joined: 08 Jun 2008 Posts: 504

Posted: Fri Sep 30, 2016 9:03 pm Post subject: 


Pixels, isn't that the brute force?
Anyway, final idea (maybe). You want to calculate f(n) for some large number of ns < N.
Code:  int arr[N]; // presumed initialized to 0
int M = 1; // holds m!
for(int m = 2;M<N;m++){
M = M * m;
for(int i = M; i<N; i=i+M) {
if(arr[i] == 0){
arr[i] = m;
}
}
} 
Should be something like (log(n))^2 or better, I'm not sure how to denote an inverse of !. Except of course initializing the array and computing the final sum are linear. Seems good enough that I think it's a serious competitor even if you don't need the first 990'000'000 numbers. 

Back to top 


Pixels Knight
Joined: 14 Jun 2010 Posts: 356

Posted: Fri Sep 30, 2016 9:53 pm Post subject: 


You... really have not thought this through very thoroughly. Or tested it. Your algorithm (as written) will literally just fill the entire array with alternating 0's and 2's. f(n)=m means that n should divide m!, not the other way around.
Off the top of my head, I can't think of a good sieverelated solution. I'm sure there's a very elegant solution out there, but I'm far too rusty on my number theory for it to rise to mind.
With a primality test, some fiddling to deal with semiprimes of the form 2*p or 3*p, and a bunch of threading I can get the expected computation time (on my work box) on my solution down to about 15 hours. Not good enough that I'd want to ship it to customers, plenty good enough for a oneoff calculation with no time constraints.
Last edited by Pixels on Fri Sep 30, 2016 9:54 pm; edited 1 time in total 

Back to top 


Starmaker Duke
Joined: 07 Mar 2008 Posts: 2289 Location: Redmonton

Posted: Tue Oct 04, 2016 8:56 am Post subject: 


Round 2 has started, and there are two problems, as before.
Problem 1 is the rain trap in 3d.
Problem 2 is the good old "earliest appearance of sequence in string".
This time we're expected to use stdinstdout, so submission checking is probably semiautomated  great news. I was told Problem 2 was a hit last year meaning hardly anyone among the Round 3 inperson interviewees solved it, and if the trend holds, the checker will fail shitty solutions, there'll be fewer Round 3 contenders and I'll get an easy rejection because I'm so old I remember Raistlin. Still, it's more fun than literacy tests. 

Back to top 


Blasted KnightBaron
Joined: 26 May 2010 Posts: 722

Posted: Wed Oct 05, 2016 12:01 am Post subject: 


The thought of problem 2 being an issue is a bit jarring. Isn't it a standard algorithm? Maybe I'm too old and crusty for this stuff. _________________


Back to top 


Ice9 Duke
Joined: 07 Mar 2008 Posts: 1518

Posted: Wed Oct 05, 2016 3:11 am Post subject: 


I think he's referring to a sequence with the same characteristics as in the OP  one composed of ordered integers starting with 1.
Which can be done in log10(pattern) time if you just want a valid index. I'm not sure off hand what it would be if you wanted to find the earliest index  ie. "213" is at index 14.
Last edited by Ice9 on Wed Oct 05, 2016 3:20 am; edited 4 times in total 

Back to top 


Starmaker Duke
Joined: 07 Mar 2008 Posts: 2289 Location: Redmonton

Posted: Wed Oct 05, 2016 9:46 am Post subject: 


Ice9 wrote:  I think he's referring to a sequence with the same characteristics as in the OP  one composed of ordered integers starting with 1.
Which can be done in log10(pattern) time if you just want a valid index. I'm not sure off hand what it would be if you wanted to find the earliest index  ie. "213" is at index 14. 
Basically, you look at the number and see if you can detect a pattern. I'll put my code back up in two weeks when the submission window closes. I googled the rain trap, after all. 

Back to top 


Starmaker Duke
Joined: 07 Mar 2008 Posts: 2289 Location: Redmonton

Posted: Wed Nov 02, 2016 9:43 am Post subject: 


"We regret to inform you we cannot extend an invitation to the School to you, as you possess sufficient knowledge to apply for a junior developer vacancy without additional education."
wat
(edit: I phoned their HR person and they're apparently moderately enthusiastic about "keeping me in mind" for when they have openings and/or recommending me to their partners, in stark contrast to last year's "meh gtfo". Time to dust off the cv.)

Sequence:
https://github.com/Zephyrine/hh16/blob/master/seq.py
Explanation:
Click here to see the hidden message (It might contain spoilers) 1. Find the first natural number of the sequence which actually appears in the target pattern by generating sequence fragments of increasing "remoteness" and trying to match them to the pattern.
2. Calculate how many digits preceded it in the natural sequence.
3. Shift as needed.
To do 1,
for each possible length k of the target number from 1 to length_of_pattern
for each possible shift j of the target number from 0 to k1
take the digits in positions [j..k+j) of the pattern, assemble a segment of the natural sequence (adding and possibly subtracting 1s as needed) induced by that number and see if it matches the pattern.
In the inner cycle, if a mismatch is found, stop this iteration and try another j; if a match is found, record the inducing number and its shift j as a possible answer and return the smallest of possible answers when the inner cycle ends
(as soon as we have a candidate, we don't need to check for possible candidates of greater lengths since they're obviously further down the natural sequence).
Example:
91029112
k = 1
j = 0: 9 + 1 = 10 == 10, 10 + 1 = 11 != 29
k = 2
j = 0: 91 + 1 = 92 != 02
j = 1: 10  1 = 9 == 9, 10 + 1 = 11 != 29
k = 3
j = 0: 910 + 1 = 911 != 291
j = 1: 102  1 = 101 != ??9
j = 2: 029 is not a natural number
k = 4
j = 0: 9102 + 1 = 9103 != 9112
j = 1: 1029  1 = 1028 != ???9
j = 2: 0291 is not a natural number
j = 3: 29111 = 2910 == ?910, 2911+1 = 2912 == 2???, pattern ends, record candidate;
we're out of js for this k and we're done, return (2911, 3).
Count the digits preceding the natural number 2911 in the sequence: 9*1+90*2+900*3+(29111000)*4
subtract 3 for the shift,
add 1 (per problem statement, we need the position of the first digit of the pattern)
and stick a fork in it.
Rain trap:
(provided for public benefit, since half the suggested solutions on the interwebs for the 2d trap are wrong)
https://github.com/Zephyrine/hh16/blob/master/flood.py
Explanation:
Click here to see the hidden message (It might contain spoilers) If either dimension is smaller than 3, the island won't hold water at all, return 0 (the general algorithm works in this case, too, but the check is cheaper).
Set total volume to 0.
Add the border cells to a heap (sorted by height), mark as visited.
Take a cell off the heap (once we're out of heap, we're done, return total volume).
For each its unvisited neighbor, add max(0, current height  neighbor height) to total volume, mark it as visited and add to the heap with height set to max(current height, neighbor height) [greater of its height and water level; water that we know isn't going anywhere effectively solidifies].
(If an HR person looking for a programmer is miraculously reading this, go ahead and PM me. I can sleep on a windowsill as long as I don't have to take literacy tests.)
Last edited by Starmaker on Wed Nov 02, 2016 10:49 am; edited 2 times in total 

Back to top 




You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum

Powered by phpBB © 2001, 2005 phpBB Group
