More about algorithms


More about algorithms

Post by raman22feb1988 »

Solving the standard cases of recurrences for algorithm complexity analysis

Master Theorem
T(n) = aT(n/b) + f(n)
If f(n) = O(n^[(log_b a) − ε]) for some ε > 0, then T(n) = Θ(n^(log_b a)).
If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · log n).
If f(n) = Ω(n^[(log_b a) + ε]) for some ε > 0 (and a·f(n/b) ≤ c·f(n) for some constant c < 1, the regularity condition), then T(n) = Θ(f(n)).

----------------------------------------------------------------------------------------------
Explanation:
You need to compare two functions, Θ(n^(log_b a)) and f(n).

Three cases
(1) T(n) = 5T(n/2)+Θ(n^2).
Compare Θ(n^(log_2 5)) with Θ(n^2). Since log_2 5 ≈ 2.32,
Θ(n^(log_2 5)) is polynomially greater, so the answer is Θ(n^(log_2 5)).

(2) T(n) = 4T(n/2)+Θ(n^2)
Compare Θ(n^(log_2 4)) = Θ(n^2) with Θ(n^2).
Both are asymptotically equal, so the answer is Θ(n^2 · log n).

(3) T(n) = 3T(n/2)+Θ(n^2)
Compare Θ(n^(log_2 3)) with Θ(n^2). Since log_2 3 ≈ 1.58,
in this case Θ(n^2) is polynomially greater, so the answer is Θ(n^2).
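
The three cases can be checked mechanically when f(n) is a plain power Θ(n^k). Here is a minimal Python sketch of that comparison (my addition, assuming no log factors in f(n)):

import math

def master_theorem(a, b, k):
    """Θ-class of T(n) = a*T(n/b) + Θ(n^k)."""
    c = math.log(a, b)                  # the critical exponent log_b(a)
    if math.isclose(k, c):
        return f"Θ(n^{k} · log n)"      # case 2: both parts contribute
    if k < c:
        return f"Θ(n^{c:.3f})"          # case 1: the recursion dominates
    return f"Θ(n^{k})"                  # case 3: f(n) dominates

print(master_theorem(5, 2, 2))   # Θ(n^2.322), example (1)
print(master_theorem(4, 2, 2))   # Θ(n^2 · log n), example (2)
print(master_theorem(3, 2, 2))   # Θ(n^2), example (3)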

------------------------------------------------------------------------------------------------------------
Cases where the Master Theorem cannot be applied
T(n) = 4T(n/2) + Θ(n^2 · log n)
Comparison: Θ(n^(log_2 4)) = Θ(n^2) versus Θ(n^2 · log n).
Θ(n^2 · log n) is asymptotically greater, but only by a log factor, not polynomially (there is no ε > 0 with n^(2+ε) = O(n^2 · log n)), so none of the three cases applies.

-------------------------------------------------------------------------------------------------------------
More examples:
(1) Take this one:

T(n) = T(√n) + Θ(n).
Put n=2^m.
T(2^m) = T(2^(m/2)) + Θ(2^m).
Let S(m) = T(2^m).
S(m) = S(m/2) + Θ(2^m).
S(m) = Θ(2^m), since the top-level term dominates the geometrically smaller ones,
= Θ(n).

(2) Consider this recurrence:
T(n) = T(√n) + Θ(1).
T(2^m) = T(2^(m/2)) + Θ(1).
S(m) = S(m/2) + Θ(1).
= Θ(log m)
= Θ(log log n)

(3) T(n) = T(√n) + Θ(log n).
T(2^m) = T(2^(m/2)) + Θ(m).

S(m) = S(m/2) + Θ(m).
= Θ(m)
= Θ(log n)

(4) T(n) = 2T(√n) + Θ(log n).
T(2^m) = 2T(2^(m/2)) + Θ(m).
S(m) = 2S(m/2) + Θ(m).
= Θ(m log m)
= Θ(log n log log n)
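
As a quick empirical check of the substitution n = 2^m used in examples (1)–(4) (my addition, not from the original post): repeatedly replacing n by √n takes about log_2(log_2 n) steps, which is why these recurrences shrink so fast.

import math

def sqrt_depth(n):
    """Count how many times n can be replaced by sqrt(n) before n < 2."""
    steps = 0
    while n >= 2:
        n = math.sqrt(n)
        steps += 1
    return steps

for n in (2**4, 2**16, 2**64):
    print(n, sqrt_depth(n), math.log2(math.log2(n)))
# steps ≈ log2(log2 n) + 1, i.e. Θ(log log n) growth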

(5) Next, consider
T(n) = T(n-1) + Θ(n^2).
By repeated substitution (brute force),
T(n) = T(n-1) + cn^2
= T(n-2) + c(n-1)^2 + cn^2
= T(n-3) + c(n-2)^2 + c(n-1)^2 + cn^2
= ......
= T(1) + c(2^2 + 3^2 + ... + n^2).
If T(1) = k,
then T(n) = k + c(n(n+1)(2n+1)/6 − 1).
Hence, T(n) = Θ(n^3).

Re: More about algorithms

Post by raman22feb1988 »

Divide and Conquer

Divide and Conquer is the process of solving a problem by dividing it into two or more smaller
sub-problems of the same type, and later merging their results to get the solution of the big problem.

In fact, it is a recursive technique, in which a call to the same problem is made every time with smaller
parameters. Once a sub-problem is small enough, it is solved directly by brute force.

Some of the problems which are solved by Divide and Conquer are:
1. Binary Search
2. Big Integer Multiplication
3. Quick Sort
4. Merge Sort
5. Matrix Multiplication

Binary Search (start, end, x)
If start > end, then report "x not found"
middle = (start + end) / 2
If x = element at middle, then return middle
If x > element at middle, then call Binary Search (middle + 1, end, x)
If x < element at middle, then call Binary Search (start, middle - 1, x)

So, we have that T(n) = T(n/2) + Θ(1),
which evaluates to T(n) = Θ(log n)
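
Here is a runnable Python sketch of the pseudocode above (my addition; it returns None rather than printing when x is absent):

def binary_search(arr, x, start=0, end=None):
    """Return an index of x in the sorted list arr, or None if absent."""
    if end is None:
        end = len(arr) - 1
    if start > end:                    # searched range is empty: x not found
        return None
    middle = (start + end) // 2
    if arr[middle] == x:
        return middle
    if x > arr[middle]:
        return binary_search(arr, x, middle + 1, end)
    return binary_search(arr, x, start, middle - 1)

print(binary_search([2, 3, 5, 7, 11, 13], 7))   # 3
print(binary_search([2, 3, 5, 7, 11, 13], 4))   # None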

Big Integer Multiplication (X, Y)
Big integer multiplication can also be done by Divide and Conquer.
Let X and Y be integers with 2N digits each. We split each of them into two numbers of N digits,
multiply the parts, and then merge the results.

Let X = (10^N) A + B
and Y = (10^N) C + D
Then, we have that
XY = (10^2N) AC + (10^N) (AD + BC) + BD
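
This split translates directly into code; a Python sketch of mine follows. Note that with 4 recursive products the recurrence is T(n) = 4T(n/2) + Θ(n), which is still Θ(n²); Karatsuba's refinement reuses these same pieces with only 3 products.

def multiply(x, y, n_digits):
    """Multiply two non-negative integers of at most n_digits digits each."""
    if n_digits <= 1:
        return x * y                    # base case: single-digit product
    half = n_digits // 2
    p = 10 ** half
    a, b = divmod(x, p)                 # x = 10^half * a + b
    c, d = divmod(y, p)                 # y = 10^half * c + d
    ac = multiply(a, c, n_digits - half)
    ad = multiply(a, d, n_digits - half)
    bc = multiply(b, c, half)
    bd = multiply(b, d, half)
    return (10 ** (2 * half)) * ac + p * (ad + bc) + bd

print(multiply(1234, 5678, 4))  # 7006652
print(1234 * 5678)              # 7006652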

Quick Sort
In quick sort, we initially have an unsorted array, and try to sort it. In an array of N
elements, we usually take the last element as the pivot, and we initially have i = 0 and j = 1.
We increment j every iteration, up to the penultimate element, so we always have i < j.

In every iteration, if element j > pivot, we simply advance j. But if element j < pivot,
then we swap element i with element j and advance i as well, so that the elements smaller than the
pivot always end up on the left. After j has traversed up to the penultimate element, we bring
the pivot (the last element) to the position next to element i.

Now that the position of the pivot element is fixed, elements left of it will be smaller, and
elements right of it will be larger. Now, we can apply quick sort to each of the smaller arrays.
Note that it will take one traversal to split up into two arrays each time, so Θ(n).

Time Complexity: In the best case, it will always split up into two arrays of equal size N/2.
So, the time complexity will be T(n) = 2T(n/2) + Θ(n), which gives T(n) = Θ(n log n).

In the worst case, it will always split up into arrays of size N-1 and 1. So, in this case, the
time complexity will be T(n) = T(n-1) + Θ(n), that is T(n)=Θ(n²).

In the average case, suppose the split is into arrays of size M and N−M. Then T(n) = T(m) + T(n−m) + Θ(n).
For example, if M = N/3, then T(n) = T(n/3) + T(2n/3) + Θ(n), which still gives
T(n) = Θ(n log n).
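
Here is a runnable Python sketch of the partition just described (my addition): last element as pivot, i bounding the region of smaller elements, j sweeping up to the penultimate element (the Lomuto scheme).

def quicksort(arr, lo=0, hi=None):
    if hi is None:
        hi = len(arr) - 1
    if lo >= hi:
        return
    pivot = arr[hi]                     # last element as the pivot
    i = lo                              # boundary of the "smaller" region
    for j in range(lo, hi):             # j sweeps up to the penultimate element
        if arr[j] < pivot:              # smaller elements move left of the boundary
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[i], arr[hi] = arr[hi], arr[i]   # bring the pivot next to element i
    quicksort(arr, lo, i - 1)           # pivot at index i is now fixed
    quicksort(arr, i + 1, hi)

data = [5, 2, 9, 1, 7, 3]
quicksort(data)
print(data)  # [1, 2, 3, 5, 7, 9]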

Merge Sort
In merge sort, we usually take an unsorted array containing 2^N elements, and sort it
by Divide and Conquer. We split the array into two arrays of the same size, and later
merge them.

I hope that everyone is familiar with merging two sorted arrays: we keep a pointer
to the first element of each array, and compare the two elements. Whichever element
is smaller, we extract it, and advance that array's pointer to its next element.
Since this requires a single complete traversal of both arrays, merging takes Θ(n)
operations. Partitioning an array into two smaller arrays of the same size requires
Θ(1) operations (just an index computation). So the recurrence for this case is
T(n) = 2T(n/2) + Θ(n), which is T(n) = Θ(n log n).
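
A minimal Python sketch of this (my addition; slices are used for the split, which is convenient though it copies):

def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # compare the two front elements
        if left[i] <= right[j]:
            merged.append(left[i])            # extract the smaller element
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])                   # one side may have leftovers
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1, 7, 3]))  # [1, 2, 3, 5, 7, 9]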

Heap Sort
Though heap sort is not a divide and conquer algorithm, I will include it with the other
sorting algorithms. A heap is an array data structure, viewed as a binary tree filled in
level order. In a max-heap, the value of every parent node is greater than the values
of its left and right children.

So, we start with an unsorted array, and insert the elements into the heap at the child
nodes from left to right. Whenever the value of a parent node is less than that of a child,
we swap the parent with the child, so that the greater value moves up as far as possible.
In this way, we construct a max-heap.

To sort the array, we extract the elements from the heap. The root always holds the
greatest value, so we delete the root each time. When the root is deleted, the child
with the greater value takes the place of the root node, and all the nodes attached to
that child are pushed upwards. Following level order traversal, if this movement leaves
some position empty, the rightmost child node of the last level takes the place of that vacancy.
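
For completeness, a standard in-place heap sort sketch (my addition). Note that deletion here swaps the root with the last leaf and sifts it down, a common alternative to the child-promotion scheme described above; the Θ(n log n) bound is the same.

def sift_down(arr, root, end):
    """Restore the max-heap property below arr[root], within arr[0..end]."""
    while True:
        child = 2 * root + 1                      # left child in level order
        if child > end:
            return
        if child + 1 <= end and arr[child + 1] > arr[child]:
            child += 1                            # pick the greater child
        if arr[root] >= arr[child]:
            return                                # heap property already holds
        arr[root], arr[child] = arr[child], arr[root]
        root = child

def heap_sort(arr):
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):           # build the max-heap
        sift_down(arr, i, n - 1)
    for end in range(n - 1, 0, -1):               # repeatedly extract the root
        arr[0], arr[end] = arr[end], arr[0]       # greatest value goes to the back
        sift_down(arr, 0, end - 1)

data = [5, 2, 9, 1, 7, 3]
heap_sort(data)
print(data)  # [1, 2, 3, 5, 7, 9]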

Matrix Multiplication
Two big matrices can be multiplied by using Divide and Conquer, by dividing them each
time into smaller matrices, and then multiplying them, and then later on merging them up.

Suppose that we have two matrices, partitioned into half-size blocks,
[A B]   [E F]
[C D]   [G H]
and that their product is
[P Q]
[R S]
by normal matrix multiplication, we have that
P = AE + BG
Q = AF + BH
R = CE + DG
S = CF + DH
which requires 8 multiplications of the half-size matrices A, B, C, D, E, F, G, H.

The other operations, such as addition, take Θ(n²) time, because every element of an
n×n matrix has to be added to the corresponding element of the other matrix of the
same dimensions. So the overall time complexity comes out to be
T(n) = 8T(n/2) + Θ(n²), which yields
T(n) = Θ(n³).

By using Strassen's algorithm, we instead take 7 new variables, which hold the products
of 7 pairs of smaller matrices. They are (each a single product, as the factored forms show):
T = AF − AH = A(F − H)
U = AH + BH = (A + B)H
V = CE + DE = (C + D)E
W = DG − DE = D(G − E)
X = AE + AH + DE + DH = (A + D)(E + H)
Y = BG + BH − DG − DH = (B − D)(G + H)
Z = AE + AF − CE − CF = (A − C)(E + F)
and then we compute the four blocks of the product as
P = X + W - U + Y
Q = T + U
R = V + W
S = X + T - V - Z
Thus the time complexity of matrix multiplication by this algorithm is
T(n) = 7T(n/2) + Θ(n²), or
T(n) = Θ(n^(log_2 7)) ≈ Θ(n^2.81).
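
One level of Strassen's recursion can be checked directly, keeping the variable names above (a sketch of mine using numpy; the comments restate the products):

import numpy as np

def strassen_once(M1, M2):
    """Multiply two even-sized square matrices with 7 half-size products."""
    n = M1.shape[0] // 2
    A, B = M1[:n, :n], M1[:n, n:]
    C, D = M1[n:, :n], M1[n:, n:]
    E, F = M2[:n, :n], M2[:n, n:]
    G, H = M2[n:, :n], M2[n:, n:]
    T = A @ (F - H)           # = AF - AH
    U = (A + B) @ H           # = AH + BH
    V = (C + D) @ E           # = CE + DE
    W = D @ (G - E)           # = DG - DE
    X = (A + D) @ (E + H)     # = AE + AH + DE + DH
    Y = (B - D) @ (G + H)     # = BG + BH - DG - DH
    Z = (A - C) @ (E + F)     # = AE + AF - CE - CF
    P = X + W - U + Y
    Q = T + U
    R = V + W
    S = X + T - V - Z
    return np.block([[P, Q], [R, S]])

M1 = np.random.randint(0, 10, (4, 4))
M2 = np.random.randint(0, 10, (4, 4))
print(np.array_equal(strassen_once(M1, M2), M1 @ M2))  # True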

Re: More about algorithms

Post by raman22feb1988 »

The Knapsack Problem

Problem Statement: Given N objects, with weights W(1), W(2), W(3), ..., W(n), and costs
P(1), P(2), P(3), ..., P(n), we have to choose a subset of the N objects with total weight
at most a specified value, say W, and with the maximum total cost.

There are two variations of the Knapsack problem.
1. Fractional Knapsack problem: the quantity chosen of each object can be a fraction
between 0 and 1. This is solved by the greedy approach.
2. 0-1 Knapsack problem: each object can either be chosen or not chosen, so the
amount of each object is either exactly 0 or 1. This is solved by dynamic programming.

So, we have to maximize
P(1)x(1) + P(2)x(2) + P(3)x(3) + ... + P(n)x(n)
subject to
W(1)x(1) + W(2)x(2) + W(3)x(3) + ... + W(n)x(n) ≤ W.

where x(i) is the variable, which can be anything between 0 and 1 in case of the Fractional Knapsack problem,
and either exactly 0 or 1 in the case of the 0-1 Knapsack problem.

Greedy Method
Greedy means that we start with the best object, and then move in order to the next best
objects, until we have a solution.
For example,
1. In Huffman coding, we start with the characters that have the minimum number of occurrences,
make them the deepest leaves of the tree, and then move on to the characters with the next
lowest frequencies.
2. In Kruskal's algorithm or Prim's algorithm for obtaining the minimal spanning tree of a
graph, we start with the edge of minimum weight, and then add edges of the next smallest
weights, skipping any edge that would create a cycle.
3. In Dijkstra's algorithm for the shortest path between two stations, we start
with the source, and then repeatedly extend along the minimum-weight edge available,
until we reach the destination.

Dynamic Programming
Dynamic programming is the process of breaking a problem into overlapping subproblems,
solving each subproblem only once and storing its result, and then assembling the final
answer from the stored results, so that the optimal solution is reached with the minimum
amount of recomputation and backtracking.

For example, finding the earliest starting time and the latest finishing time of activities by
CPM-PERT is solved using dynamic programming. We examine all the possible paths
to the end, keeping the best value at each stage, and then backtrack from there to the
beginning.

Constructing an optimal binary search tree can also be solved by dynamic
programming. That is, we minimize the cost of the tree, where each node carries a
weight, and the cost contributed by a node increases with its depth in the tree.

Fractional Knapsack Problem
The fractional Knapsack problem is solved by the greedy method.
Consider the four objects with the following weights and costs:

Gold: 1 kg, $15
Copper: 5 kg, $10
Silver: 3 kg, $9
Bronze: 4 kg, $5
subject to a maximum weight of 8 kg.

In the Fractional Knapsack variation, we can choose partial quantity of any object.
By using the greedy methods, we can start up first with the:
1. Object with the highest cost
2. Object with the lowest weight
3. Object with the highest cost/weight ratio.

For the best results, we will always go for the object with the highest cost/weight ratio first.

We can accommodate up to 8 kg, so we first put in 1 kg of gold, then 3 kg of silver,
and finally 4 kg out of the 5 kg of copper, worth (4/5) × $10 = $8. Hence the total cost
comes out to be
$15 + $9 + $8 = $32.

Notice that if we go for the object with the highest cost first, we get
1 kg of gold + 5 kg of copper + 2 kg out of the 3 kg of silver.
Total cost = $15 + $10 + $6 = $31.

If we go for the object with the lowest weight first, we get
1 kg of gold + 3 kg of silver + 4 kg of bronze.
Total cost = $15 + $9 + $5 = $29.

Of the three strategies, the cost/weight ratio gives the optimal value, $32; for the
fractional problem this greedy choice is always optimal.
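
A Python sketch of the cost/weight greedy above (my addition), reproducing the $32 answer:

def fractional_knapsack(items, capacity):
    """items: list of (name, weight, cost). Returns (total cost, choices)."""
    total, choices = 0.0, []
    # sort by cost per kg, best ratio first
    for name, weight, cost in sorted(items, key=lambda t: t[2] / t[1], reverse=True):
        take = min(weight, capacity)          # take all of it, or what fits
        if take <= 0:
            break
        total += cost * take / weight         # fractional share of the cost
        choices.append((name, take))
        capacity -= take
    return total, choices

items = [("gold", 1, 15), ("copper", 5, 10), ("silver", 3, 9), ("bronze", 4, 5)]
print(fractional_knapsack(items, 8))
# (32.0, [('gold', 1), ('silver', 3), ('copper', 4)]) — the $32 above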

0-1 Knapsack Problem
If the problem defines each object as a single entity, and only allows us either to
choose an object or to leave it, then:

By the greedy method, starting with the highest-cost object first, we arrive at
1 kg of gold + 5 kg of copper.
No further object fits in the bag, and thus
the total cost = $15 + $10 = $25.

Brute force will certainly give the optimal solution, but it is expensive, because we
would need to examine all 2^N possibilities for N objects, requiring exponential time.

The best way to solve the 0-1 Knapsack problem is the dynamic programming approach.

We declare a table A of size (number of objects + 1) × (capacity of the bag + 1),
and then calculate its values directly from the following formula:

A[i][j] = 0, if either i = 0 or j = 0
A[i][j] = A[i-1][j], if W(i) > j
A[i][j] = max(A[i-1][j], P(i) + A[i-1][j-W(i)]), if W(i) ≤ j

Taking the weights up to the capacity of the bag along the rows (j = 0..8), and the
number of objects considered along the columns (i = 0..4, in the order gold, copper,
silver, bronze), we build up the following table.

j\i  0   1   2   3   4
0    0   0   0   0   0
1    0  15  15  15  15
2    0  15  15  15  15
3    0  15  15  15  15
4    0  15  15  24  24
5    0  15  15  24  24
6    0  15  25  25  25
7    0  15  25  25  25
8    0  15  25  25  29

Thus, by the dynamic programming method, the optimal solution for this problem is
1 kg of gold + 3 kg of silver + 4 kg of bronze,
that is, a total cost of $15 + $9 + $5 = $29.

Notice that the table also answers the smaller subproblems:
If we may select only up to the third object, with a bag that accommodates 4 kg,
we can take objects worth up to $24.
If we may select up to the second object, with a bag of 6 kg, we can take objects
worth up to $25.
And finally, for a bag of 8 kg and up to the fourth object, we can accommodate
items worth up to $29.
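
A Python sketch of this dynamic program (my addition), reproducing the table entries quoted above:

def knapsack_01(weights, costs, capacity):
    """A[i][j] = best cost using the first i objects with capacity j."""
    n = len(weights)
    A = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, capacity + 1):
            if weights[i - 1] > j:               # object i does not fit
                A[i][j] = A[i - 1][j]
            else:                                # skip it, or take it
                A[i][j] = max(A[i - 1][j],
                              costs[i - 1] + A[i - 1][j - weights[i - 1]])
    return A

# gold, copper, silver, bronze — same order as the table above
A = knapsack_01([1, 5, 3, 4], [15, 10, 9, 5], 8)
print(A[4][8])  # 29 — the optimal total cost
print(A[3][4])  # 24 — up to the third object, 4 kg bag
print(A[2][6])  # 25 — up to the second object, 6 kg bag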

Generating Context Sensitive Grammars

Post by raman22feb1988 »

Recently, I have generated unrestricted grammars for some of the context-sensitive
languages. Here is the procedure to do so. Let me demonstrate it, starting with

a^(2^n)
S→AaCB
Ca→aaC
CB→DB
CB→E
aD→Da
AD→AC
aE→Ea
AE→ε

To generate an unrestricted grammar of this sort, we establish one non-terminal on the
left hand side and one non-terminal on the right hand side, which act as left and right bounds;
here they are A and B. Then, we introduce a non-terminal C, which acts as the first state,
traversing rightward between A and B. We start with the terminal 'a' between A and B,
the shortest string of the language. Every time the state C encounters an 'a'
to its right, it replaces it by two 'a's. The second state D is for traversing to the left, and skips
past every 'a' without changing it at all. B changes C to D on the right and A changes D back to C on
the left. At the end, when there are enough 'a's, the non-determinism at CB changes the state C to
the third state E, removes B, and traverses past all the 'a's. Finally, AE→ε removes the non-terminal A
together with E. If more 'a's are wanted, the state C is changed to the state D instead, and the
branching continues.
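
To check small cases mechanically, here is a brute-force sketch of mine (not part of the construction): it runs a breadth-first search over sentential forms of the grammar above and tests which strings a^k are derivable.

from collections import deque

RULES = [("S", "AaCB"), ("Ca", "aaC"), ("CB", "DB"), ("CB", "E"),
         ("aD", "Da"), ("AD", "AC"), ("aE", "Ea"), ("AE", "")]

def derives(target, max_len=40, max_steps=200000):
    """BFS over sentential forms; True if the grammar derives `target`."""
    seen, queue = {"S"}, deque(["S"])
    steps = 0
    while queue and steps < max_steps:
        form = queue.popleft()
        steps += 1
        if form == target:
            return True
        for lhs, rhs in RULES:              # apply every rule at every position
            start = form.find(lhs)
            while start != -1:
                nxt = form[:start] + rhs + form[start + len(lhs):]
                if len(nxt) <= max_len and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                start = form.find(lhs, start + 1)
    return False

for k in range(1, 9):
    print(k, derives("a" * k))  # True exactly for k = 1, 2, 4, 8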

a^(n^2)
S→LAYR
ZA→aAZ
Za→aZ
ZR→AAYR
aY→Ya
AY→YA
LY→LZ
YR→X
aX→Xa
AX→Xa
LX→ε

In this case, since n² = 1 + 3 + 5 + ... + (2n−1), at any instance n = i we will have (i−1)² A's
and (2i−1) a's between the left and right bounds, L and R respectively. State Z is for traversing to
the right, Y is for traversing to the left, and X is for finally terminating the branching; X
converts all the A's to a's as it makes its final pass to the left. For the next instance n = i+1,
(2i−1) A's are added, one per 'a', as the state Z moves to the right. At the
rightmost extreme, as Z meets R and converts to Y, 2 more a's are added at that end, so that
now, with n = i+1, we will have i² A's and (2i+1) a's between L and R. Note that if we add only
one 'a' at the right end instead of two, that is ZR→AYR instead of ZR→AAYR, then the grammar yields
the set of triangular numbers, a^(nC2).

a^(Fibonacci Number)
S→LBYR
YR→X
BX→Xa
AX→Xa
LX→ε
BY→YAB
AY→YA
LY→LZ
ZA→BZ
ZB→AZ
ZR→YR

Here, we use two variables A and B (instead of one) between L and R. Since F(n+2)
= F(n) + F(n+1), at any instance we will have F(n) A's and F(n+1) B's between L and R. While
traversing leftward with the state Y, we add F(n+1) A's, one per B, and while traversing rightward
with the state Z, we replace all the A's with B's and all the B's with A's, so that at
the next instance we have F(n+1) A's and F(n+2) B's between L and R.

a^n!
S→LaYR
YR→X
YR→DBR
CD→DB
aD→Da
aT→Ta
AT→TA
LT→LZ
ZA→AZ
Za→AaZ
ZB→TC
ZC→CZ
CT→TC
ZR→YR
CX→X
AX→Xa
aX→Xa
LX→ε
AD→Da
LD→LU
UA→AU
Ua→aU
UB→TC

We want to multiply successively by 1, 2, 3, 4, ... At the beginning we have one 'a', and Y is the state of the equivalent Turing machine. Each time, we store 0, 1, 2, 3, ... copies of B on the right. Notice that if we want to multiply 6 a's by 4, we already have one copy of the 6 a's, so we need only 3 more such copies.
For every B, we establish one more copy of the n! a's and convert that B to C; C means that the copy has already been established. Each 'a' is converted to Aa in the process. We need to be careful here: if we converted 'a' to 'aa' instead, then at the next copy of B each 'a' would yield 2 + 2 = 4 a's rather than 3.
Notice that Z, moving to the right, penetrates through C, not B. Once all the B's have been converted into C's, YR is either converted to X for termination, or further branching continues. Generally, for each B, T is the symbol that moves leftward and Z the one that moves rightward; but once all the B's have been converted into C's and further branching is needed, all the C's are converted back into B's, with D as the symbol moving leftward, and one additional B is added. Along the way all the A's are converted back into a's: we already have n! a's, and we move on to the multiplication by n+1. Once an L is encountered on the left, D changes its state to U, which for the first time moves rightward, up to just before the first B.

a^Composite, Fast Version
S→HaC
C→aC
C→aT
T→ZU
AZ→ZA
aZ→ZAa
HZ→HY
YA→AY
Ya→aY
YU→T
T→X
aX→Xaa
AX→Xa
HX→ε

For recognizing the language a^Composite: if we want N a's with N = p·q, where p is a
prime factor of N, we keep the left bound H ready initially, and then branch on C until we have
p a's between H and T. Since we do not know the value of p at the beginning, the
non-determinism effectively tries every value of p below N, starting from p = 2. Then, we branch
T into the state Z, for traversing leftward, and U, which acts as the right bound. While traversing
leftward, the state Z adds one A for each 'a' between H and U as it travels from the right to the
left. Since the number of a's between H and U is unaltered and equal to p, we retain them and add
the new non-terminal A instead. If we were to add an 'a' between H and U for each 'a' that the
state Z encounters to its left, then the number of a's between H and U would double each time,
and we would wrongly recognize the language given by p·(2^r) instead; so we have to be careful
in generating it.
The state Y is for traversing to the right, skipping past every 'A' and 'a', leaving them as they are.
At the end, having enough 'A's and 'a's, to stop the branching we convert YU into the
state X, which traverses leftward, converting all the 'A's and 'a's into 'a's, and then finally
terminates the parsing of the string.

a^Composite, Short Version
S→HaC
C→aC
C→aT
aT→ZAaT
AZ→ZA
aZ→ZAa
HZ→H
aT→Xaa
aX→Xaa
AX→Xa
HX→ε

a^Composite, Simple Version
S→HC
C→aC
C→aT
aT→ZAaT
AZ→ZA
aZ→ZAa
HZ→HA
aT→Xaa
aX→Xaa
AX→Xa
HX→aa

a^n.b^n.c^n
S→aAbc
S→abc
Ab→bA
Ac→Bbcc
bB→Bb
aB→aaA
aB→aa

Here, the state A, which is initially between 'a' and 'b', moves rightward past the b's until just
before 'c'; there it adds 'bc', changes into the state B, and moves back leftward past the b's;
then it adds an 'a' between 'a' and 'b' and changes back into A, lying between 'a' and 'b' again.
On the final round, aB→aa terminates instead of reintroducing A (and S→abc handles the base case
directly). Thus, not counting the non-terminals, we always have an equal number of a's followed
by the same numbers of b's and c's.

a^n.b^m.c^n.d^m
S→aAcd
A→aAc
A→B
B→bBZ
B→b
Zc→cZ
Zd→dd

At the beginning, the non-terminal A inserts equal numbers of a's and c's between the initial 'a'
and 'c'; then it changes into the state B. The state B releases a 'b' to its left, and for every 'b'
released, it also releases one Z to its right. Each copy of the non-terminal Z traverses rightward
past all the c's and, once it encounters the 'd' to its right, turns into a 'd' there at the right
end. Note that the final B releases a 'b' without a Z, matching the initial 'd' of the starting
string, so that the numbers of b's and d's always balance.

ww, w ∈ {a, b}⁺
S→aACD
S→bBCD
AC→CE
BC→CF
Ea→aE
Eb→bE
Fa→aF
Fb→bF
ED→YaD
FD→YbD
ED→Xa
FD→Xb
aX→Xa
bX→Xb
CX→ε
aY→Ya
bY→Yb
CY→ZC
Z→aA
Z→bB

Here, we have to match the string on the left to an identical string on the right. Since any
string of this language begins with 'a' or 'b', that first terminal acts as the left bound, and
D is the right bound. C is the central marker, and we have to reproduce the string to the left
of C, in the same order from left to right, between C and D. To do this, we use several states:
A reproduces an 'a' after C, changing its state to E when it crosses C; B likewise reproduces
a 'b' after C, changing its state to F when it crosses C. All four of these states traverse to
the right. Once E or F has reproduced its 'a' or 'b' just before D, it either changes its state
to X, removing the right bound D, if termination is wanted — the state X travels back to the left
and finally vanishes together with C — or it changes its state to Y, which is likewise used for
traversing to the left. Once the state Y crosses C to its left, it changes into Z. Z immediately
meets a terminal symbol to the left of C, so it changes back into A or B, appending the next
letter of w and travelling to the right again.