TYPES OF HOSTING

Internet hosting services can run Web servers; see Internet hosting services.

Hosting services limited to the Web:
[Figure: a typical server rack, commonly seen in colocation centres]

Many large companies that are not Internet service providers also need a computer permanently connected to the Internet so they can send email, files, etc. to other sites. They may also use the computer as a website host so they can provide details of their goods and services to anyone interested, who may in turn decide to place online orders.

* Free web hosting service: offered by different companies with limited services, sometimes supported by advertisements, and often limited when compared to paid hosting.
* Shared web hosting service: one's website is placed on the same server as many other sites, ranging from a few to hundreds or thousands. Typically, all domains may share a common pool of server resources, such as RAM and the CPU. The features available with this type of service can be quite extensive. A shared website may be hosted with a reseller.
* Reseller web hosting: allows clients to become web hosts themselves. Resellers could function, for individual domains, under any combination of these listed types of hosting, depending on who they are affiliated with as a provider. Resellers' accounts may vary tremendously in size: they may have anything from their own virtual dedicated server to a colocated server. Many resellers provide a nearly identical service to their provider's shared hosting plan and provide the technical support themselves.
* Virtual Dedicated Server: also known as a Virtual Private Server (VPS), divides server resources into virtual servers, where resources can be allocated in a way that does not directly reflect the underlying hardware. A VPS will often be allocated resources based on a one server to many VPSs relationship; however, virtualisation may be done for a number of reasons, including the ability to move a VPS container between servers. The users may have root access to their own virtual space. Customers are sometimes responsible for patching and maintaining the server.
* Dedicated hosting service: the user gets his or her own Web server and gains full control over it (root access for Linux/administrator access for Windows); however, the user typically does not own the server. Another type of Dedicated hosting is Self-Managed or Unmanaged. This is usually the least expensive for Dedicated plans. The user has full administrative access to the box, which means the client is responsible for the security and maintenance of his own dedicated box.
* Managed hosting service: the user gets his or her own Web server but is not allowed full control over it (root access for Linux/administrator access for Windows); however, they are allowed to manage their data via FTP or other remote management tools. The user is disallowed full control so that the provider can guarantee quality of service by not allowing the user to modify the server or potentially create configuration problems. The user typically does not own the server. The server is leased to the client.
* Colocation web hosting service: similar to the dedicated web hosting service, but the user owns the colo server; the hosting company provides physical space that the server takes up and takes care of the server. This is the most powerful and expensive type of web hosting service. In most cases, the colocation provider may provide little to no support directly for their client's machine, providing only the electrical, Internet access, and storage facilities for the server. In most cases for colo, the client would have his own administrator visit the data center on site to do any hardware upgrades or changes.
* Cloud hosting: a new type of hosting platform that offers customers powerful, scalable and reliable hosting based on clustered load-balanced servers and utility billing. It removes single points of failure and lets customers pay only for what they use rather than for what they could use.
* Clustered hosting: having multiple servers hosting the same content for better resource utilization. Clustered Servers are a perfect solution for high-availability dedicated hosting, or creating a scalable web hosting solution. A cluster may separate web serving from database hosting capability.
* Grid hosting: this form of distributed hosting is when a server cluster acts like a grid and is composed of multiple nodes.
* Home server: usually a single machine placed in a private residence can be used to host one or more web sites from a usually consumer-grade broadband connection. These can be purpose-built machines or more commonly old PCs. Some ISPs actively attempt to block home servers by disallowing incoming requests to TCP port 80 of the user's connection and by refusing to provide static IP addresses. A common way to attain a reliable DNS hostname is by creating an account with a dynamic DNS service. A dynamic DNS service will automatically change the IP address that a URL points to when the IP address changes.

Some specific types of hosting provided by web host service providers:

* File hosting service: hosts files, not web pages
* Image hosting service
* Video hosting service
* Blog hosting service
* One-click hosting
* Pastebin: hosts text snippets
* Shopping cart software
* E-mail hosting service

CLUSTER COMPUTER

Cluster computing :

A computer cluster is a group of linked computers, working together closely so that in many respects they form a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area networks. Clusters are usually deployed to improve performance and/or availability over that of a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.[1]

Cluster categorizations

High-availability (HA) clusters

High-availability clusters (also known as Failover Clusters) are implemented primarily for the purpose of improving the availability of services that the cluster provides. They operate by having redundant nodes, which are then used to provide service when system components fail. The most common size for an HA cluster is two nodes, which is the minimum requirement to provide redundancy. HA cluster implementations attempt to use redundancy of cluster components to eliminate single points of failure.

There are commercial implementations of High-Availability clusters for many operating systems. The Linux-HA project is one commonly used free software HA package for the Linux operating system.
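
The failover idea can be illustrated with a minimal Python sketch. It is not how Linux-HA or any particular product works; the node names and the `is_healthy` callback are hypothetical, and a real cluster would also need heartbeats and shared state.

```python
def active_node(nodes, is_healthy):
    """Return the first node, in priority order, that passes the health check.

    `nodes` is an ordered list (primary first, then standbys) and
    `is_healthy` is a caller-supplied function; both are illustrative.
    """
    for node in nodes:
        if is_healthy(node):
            return node
    raise RuntimeError("no healthy node available")


# Two-node cluster: the minimum needed for redundancy.
status = {"primary": True, "standby": True}
nodes = ["primary", "standby"]

serving_before = active_node(nodes, status.get)  # primary is serving
status["primary"] = False                        # simulate a component failure
serving_after = active_node(nodes, status.get)   # service moves to the standby
```

With both nodes healthy the primary serves; once it fails, the same call returns the standby, which is the redundancy the text describes.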

Load-balancing clusters


Load-balancing is when multiple computers are linked together to share computational workload or function as a single virtual computer. Logically, from the user side, they are multiple machines, but function as a single virtual machine. Requests initiated from the user are managed by, and distributed among, all the standalone computers to form a cluster. This results in balanced computational work among different machines, improving the performance of the cluster system.
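
One simple way to distribute requests is a round-robin rotation. The sketch below assumes that policy (the text does not name one) and uses made-up node and request names.

```python
from collections import Counter
from itertools import cycle


def round_robin_dispatch(requests, nodes):
    """Pair each request with the next node in a fixed rotation."""
    rotation = cycle(nodes)
    return [(request, next(rotation)) for request in requests]


nodes = ["node-a", "node-b", "node-c"]      # hypothetical cluster members
requests = [f"req-{i}" for i in range(7)]   # hypothetical user requests

assignments = round_robin_dispatch(requests, nodes)
load = Counter(node for _, node in assignments)  # requests handled per node
```

Round-robin ignores how expensive each request is; real load balancers often weigh nodes by current load or capacity instead.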

Compute clusters

Often clusters are used primarily for computational purposes, rather than handling IO-oriented operations such as web service or databases. For instance, a cluster might support computational simulations of weather or vehicle crashes. The primary distinction within compute clusters is how tightly coupled the individual nodes are. For instance, a single compute job may require frequent communication among nodes. This implies that the cluster shares a dedicated network, is densely located, and probably has homogeneous nodes. This cluster design is usually referred to as a Beowulf cluster. The other extreme is where a compute job uses one or a few nodes and needs little or no inter-node communication. This latter category is sometimes called "Grid" computing. Tightly coupled compute clusters are designed for work that might traditionally have been called "supercomputing". Middleware such as MPI (Message Passing Interface) or PVM (Parallel Virtual Machine) permits compute clustering programs to be portable to a wide variety of clusters.

Grid computing

Grids are usually computer clusters, but more focused on throughput like a computing utility rather than running fewer, tightly-coupled jobs. Often, grids will incorporate heterogeneous collections of computers, possibly distributed geographically, sometimes administered by unrelated organizations.

Grid computing is optimized for workloads which consist of many independent jobs or packets of work, which do not have to share data between the jobs during the computation process. Grids serve to manage the allocation of jobs to computers which will perform the work independently of the rest of the grid cluster. Resources such as storage may be shared by all the nodes, but intermediate results of one job do not affect other jobs in progress on other nodes of the grid.

An example of a very large grid is the Folding@home project. It analyzes data that is used by researchers to find cures for diseases such as Alzheimer's and cancer. Another large project is the SETI@home project, which may be the largest distributed grid in existence. It uses approximately three million home computers all over the world to analyze data from the Arecibo Observatory radiotelescope, searching for evidence of extraterrestrial intelligence. In both of these cases, there is no inter-node communication or shared storage. Individual nodes connect to a main, central location to retrieve a small processing job. They then perform the computation and return the result to the central server. In the case of the @home projects, the software is generally run when the computer is otherwise idle. UC Berkeley has developed an open source application, BOINC, that allows individual users to contribute to the above and other projects such as LHC@home (Large Hadron Collider) from a single manager, which can then be set to allocate a percentage of idle time to each of the projects a node is signed up for. The software and a project list are available from the BOINC project.

The grid setup means that the nodes can take however many jobs they are able to process in one session and then return the results and acquire a new job from a central project server.
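
The fetch, compute and return cycle can be sketched in Python. `ProjectServer`, `fetch_job` and `submit_result` are hypothetical names, and an in-memory queue stands in for the network; this is not the BOINC API.

```python
from collections import deque


class ProjectServer:
    """Central server that hands out independent work units (illustrative)."""

    def __init__(self, work_units):
        self.pending = deque(work_units)  # (job_id, payload) pairs
        self.results = {}

    def fetch_job(self):
        """Give the next pending job to a node, or None when none are left."""
        return self.pending.popleft() if self.pending else None

    def submit_result(self, job_id, result):
        self.results[job_id] = result


def run_node(server, compute):
    """One node's session: take jobs until none remain, returning each result.

    The node never talks to other nodes; it only talks to the server.
    """
    completed = 0
    while (job := server.fetch_job()) is not None:
        job_id, payload = job
        server.submit_result(job_id, compute(payload))
        completed += 1
    return completed


server = ProjectServer([(i, i) for i in range(5)])
jobs_done = run_node(server, lambda x: x * x)  # squaring stands in for real analysis
```

Because no job depends on another job's intermediate results, any number of nodes could call `run_node` against the same server.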

CLUSTER ANALYSIS

Cluster analysis or clustering is the assignment of a set of observations into subsets (called clusters) so that observations in the same cluster are similar in some sense. Clustering is a method of unsupervised learning, and a common technique for statistical data analysis used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics.

Besides the term clustering, there are a number of terms with similar meanings, including automatic classification, numerical taxonomy, botryology and typological analysis.


Types of clustering

Hierarchical algorithms find successive clusters using previously established clusters. These algorithms can be either agglomerative ("bottom-up") or divisive ("top-down"). Agglomerative algorithms begin with each element as a separate cluster and merge them into successively larger clusters. Divisive algorithms begin with the whole set and proceed to divide it into successively smaller clusters.

Partitional algorithms typically determine all clusters at once, but can also be used as divisive algorithms in the hierarchical clustering.

Density-based clustering algorithms are devised to discover arbitrary-shaped clusters. In this approach, a cluster is regarded as a region in which the density of data objects exceeds a threshold. DBSCAN and OPTICS are two typical algorithms of this kind.

Two-way clustering, co-clustering or biclustering are clustering methods where not only the objects are clustered but also the features of the objects, i.e., if the data is represented in a data matrix, the rows and columns are clustered simultaneously.

Many clustering algorithms require specification of the number of clusters to produce in the input data set, prior to execution of the algorithm. Barring knowledge of the proper value beforehand, the appropriate value must be determined, a problem for which a number of techniques have been developed.

Distance measure

An important step in any clustering is to select a distance measure, which will determine how the similarity of two elements is calculated. This will influence the shape of the clusters, as some elements may be close to one another according to one distance and farther away according to another. For example, in a 2-dimensional space, the distance between the point (x = 1, y = 0) and the origin (x = 0, y = 0) is always 1 according to the usual norms, but the distance between the point (x = 1, y = 1) and the origin can be 2, √2 or 1 if you take respectively the 1-norm, 2-norm or infinity-norm distance.
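
The distances in this example can be checked with a short Python sketch. The function names are illustrative and not taken from any particular library.

```python
import math


def manhattan(p, q):
    """1-norm distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean(p, q):
    """2-norm distance: straight-line distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def chebyshev(p, q):
    """Infinity-norm distance: largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(p, q))


origin = (0, 0)
on_axis = (1, 0)    # all three norms agree on this point
diagonal = (1, 1)   # the three norms disagree on this point

axis_distances = [f(on_axis, origin) for f in (manhattan, euclidean, chebyshev)]
diagonal_distances = [f(diagonal, origin) for f in (manhattan, euclidean, chebyshev)]
```

For the point on the axis every entry is 1; for the diagonal point the entries are 2, √2 and 1, matching the text.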

Common distance functions:

* The Euclidean distance (also called distance as the crow flies or 2-norm distance). A review of cluster analysis in health psychology research found that the most common distance measure in published studies in that research area is the Euclidean distance or the squared Euclidean distance.
* The Manhattan distance (aka taxicab norm or 1-norm)
* The maximum norm (aka infinity norm)
* The Mahalanobis distance corrects data for different scales and correlations in the variables
* The angle between two vectors can be used as a distance measure when clustering high dimensional data. See Inner product space.
* The Hamming distance measures the minimum number of substitutions required to change one member into another.
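
Two of the less familiar measures in this list, the Hamming distance and the angle between vectors, can be sketched in Python as follows. The function names and sample inputs are illustrative only.

```python
import math


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def angle(u, v):
    """Angle in radians between two non-zero vectors, via the inner product."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


substitutions = hamming("karolin", "kathrin")  # differs at three positions
right_angle = angle((1, 0), (0, 1))            # orthogonal vectors
same_direction = angle((1, 1), (2, 2))         # parallel vectors, different lengths
```

The angle ignores vector length, which is one reason it is used for high-dimensional data where magnitudes vary widely.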

Another important distinction is whether the clustering uses symmetric or asymmetric distances. Many of the distance functions listed above have the property that distances are symmetric (the distance from object A to B is the same as the distance from B to A). In other applications (e.g., sequence-alignment methods, see Prinzie & Van den Poel (2006)), this is not the case. (A true metric gives symmetric measures of distance.)

Hierarchical clustering


Hierarchical clustering creates a hierarchy of clusters which may be represented in a tree structure called a dendrogram. The root of the tree consists of a single cluster containing all observations, and the leaves correspond to individual observations.

Algorithms for hierarchical clustering are generally either agglomerative, in which one starts at the leaves and successively merges clusters together; or divisive, in which one starts at the root and recursively splits the clusters.

Any valid metric may be used as a measure of similarity between pairs of observations. The choice of which clusters to merge or split is determined by a linkage criterion, which is a function of the pairwise distances between observations.

Cutting the tree at a given height will give a clustering at a selected precision. In the following example, cutting after the second row will yield clusters {a} {b c} {d e} {f}. Cutting after the third row will yield clusters {a} {b c} {d e f}, which is a coarser clustering, with a smaller number of larger clusters.
Agglomerative hierarchical clustering


For example, suppose this data is to be clustered, and the Euclidean distance is the distance metric.

[Figure: raw data]

The hierarchical clustering dendrogram would be as follows:

[Figure: dendrogram, traditional representation]

This method builds the hierarchy from the individual elements by progressively merging clusters. In our example, we have six elements {a} {b} {c} {d} {e} and {f}. The first step is to determine which elements to merge in a cluster. Usually, we want to take the two closest elements, according to the chosen distance.

Optionally, one can also construct a distance matrix at this stage, where the number in the i-th row j-th column is the distance between the i-th and j-th elements. Then, as clustering progresses, rows and columns are merged as the clusters are merged and the distances updated. This is a common way to implement this type of clustering, and has the benefit of caching distances between clusters. A simple agglomerative clustering algorithm is described in the single-linkage clustering page; it can easily be adapted to different types of linkage (see below).

Suppose we have merged the two closest elements b and c. We now have the following clusters {a}, {b, c}, {d}, {e} and {f}, and want to merge them further. To do that, we need to take the distance between {a} and {b c}, and therefore define the distance between two clusters. Usually the distance between two clusters \mathcal{A} and \mathcal{B} is one of the following:

* The maximum distance between elements of each cluster (also called complete linkage clustering):

\max \{\, d(x,y) : x \in \mathcal{A},\, y \in \mathcal{B}\,\}.

* The minimum distance between elements of each cluster (also called single-linkage clustering):

\min \{\, d(x,y) : x \in \mathcal{A},\, y \in \mathcal{B} \,\}.

* The mean distance between elements of each cluster (also called average linkage clustering, used e.g. in UPGMA):

{1 \over {|\mathcal{A}|\cdot|\mathcal{B}|}}\sum_{x \in \mathcal{A}}\sum_{ y \in \mathcal{B}} d(x,y).

* The sum of all intra-cluster variance.
* The increase in variance for the cluster being merged (Ward's criterion).
* The probability that candidate clusters spawn from the same distribution function (V-linkage).

Each agglomeration occurs at a greater distance between clusters than the previous agglomeration, and one can decide to stop clustering either when the clusters are too far apart to be merged (distance criterion) or when there is a sufficiently small number of clusters (number criterion).
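
The agglomerative procedure, with the complete, single and average linkage criteria and both stopping rules, can be sketched in Python. This is a naive illustration assuming Euclidean distance (`math.dist`, Python 3.8+); it recomputes linkages on every pass instead of caching a distance matrix, and the sample points are made up.

```python
import math
from itertools import combinations


def single_linkage(a, b):
    """Minimum distance between elements of the two clusters."""
    return min(math.dist(x, y) for x in a for y in b)


def complete_linkage(a, b):
    """Maximum distance between elements of the two clusters."""
    return max(math.dist(x, y) for x in a for y in b)


def average_linkage(a, b):
    """Mean distance over all cross-cluster pairs."""
    return sum(math.dist(x, y) for x in a for y in b) / (len(a) * len(b))


def agglomerate(points, linkage, num_clusters=1, max_distance=math.inf):
    """Merge the two closest clusters until a stopping rule is met.

    Stops when only `num_clusters` remain (number criterion) or when the
    closest pair is farther apart than `max_distance` (distance criterion).
    Returns the clusters and the distance at which each merge happened.
    """
    clusters = [[p] for p in points]  # start with one cluster per element
    merge_distances = []
    while len(clusters) > max(num_clusters, 1):
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        distance = linkage(clusters[i], clusters[j])
        if distance > max_distance:
            break
        merge_distances.append(distance)
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, merge_distances


points = [(0, 0), (1, 0), (5, 0), (6, 0), (20, 0)]  # made-up sample data
two_clusters, single_steps = agglomerate(points, single_linkage, num_clusters=2)
_, complete_steps = agglomerate(points, complete_linkage, num_clusters=2)
_, average_steps = agglomerate(points, average_linkage, num_clusters=2)
close_only, _ = agglomerate(points, single_linkage, max_distance=2)
```

On this data all three linkages produce the same two clusters, but the final merge happens at a different distance under each criterion.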

FOURTH-GENERATION LANGUAGES

General Use / Versatile:

* Agile Business Suite
* Clipper
* Cognos PowerHouse 4GL
* DataFlex
* Discovery Machine Modeler
* Forté TOOL (transactional object-oriented language)
* FoxPro
* IBM Rational EGL (Enterprise Generation Language)
* Lycia Querix 4GL
* Omnis Studio SDK
* Panther
* PowerBuilder
* SheerPower4GL (Microsoft Windows Only)
* SQLWindows/Team Developer
* Up ! 5GL
* Visual DataFlex (Microsoft Windows Only)
* WinDev
* XBase++

Database query languages:

* FOCUS
* Genero
* SB+/SystemBuilder
* Informix-4GL
* Lycia Querix 4GL
* NATURAL
* Progress 4GL
* SQL

Report generators:

* BuildProfessional
* GEMBase
* IDL-PV/WAVE
* LINC
* Metafont
* NATURAL
* Oracle Reports
* Progress 4GL Query/Results
* Quest
* Report Builder
* RPG-II

Data manipulation, analysis, and reporting languages:

* Ab Initio
* ABAP
* Aubit-4GL
* Audit Command Language
* Clarion Programming Language
* CorVision
* Culprit
* ADS/Online (plus transaction processing)
* DASL
* Easytrieve
* FOCUS
* GraphTalk
* IDL
* IGOR Pro
* Informix-4GL
* LANSA
* LabVIEW
* MAPPER (Unisys/Sperry) now part of BIS
* MARK-IV (Sterling/Informatics) now VISION:BUILDER of CA
* Mathematica
* MATLAB
* NATURAL
* Nomad
* PL/SQL
* Progress 4GL
* PROIV
* Lycia Hermes Querix 4GL
* R
* Ramis
* S
* SAS
* SPSS
* Stata
* Synon
* XBase++
* SQR

Data-stream languages:

* APE
* AVS
* Iris Explorer

Database driven GUI Application Development:

* Action Request System
* Genexus
* SB+/SystemBuilder
* Progress Dynamics
* UNIFACE

Screen painters and generators:

* FOURGEN CASE Tools for Rapid Application Development by Gillani
* SB+/SystemBuilder
* Oracle Forms
* Progress 4GL ProVision
* Unify Accell

GUI creators:

* 4th Dimension (Software)
* eDeveloper
* MATLAB's GUIDE
* Omnis Studio
* OpenROAD
* Progress 4GL AppBuilder
* Revolution programming language
* Sculptor 4GL

Mathematical Optimization:

* AIMMS
* AMPL
* GAMS

Web development languages:

* ColdFusion
* Wavemaker: open source, browser-based development platform for Ajax development based on Dojo, Spring, Hibernate
* OutSystems

FOURTH GENERATION PROGRAMMING LANGUAGE

A fourth-generation programming language (1970s-1990) (abbreviated 4GL) is a programming language or programming environment designed with a specific purpose in mind, such as the development of commercial business software. In the evolution of computing, the 4GL followed the 3GL in an upward trend toward higher abstraction and statement power. The 4GL was followed by efforts to define and use a 5GL.

The natural-language, block-structured mode of the third-generation programming languages improved the process of software development. However, 3GL development methods can be slow and error-prone. It became clear that some applications could be developed more rapidly by adding a higher-level programming language and methodology which would generate the equivalent of very complicated 3GL instructions with fewer errors. In some senses, software engineering arose to handle 3GL development. 4GL and 5GL projects are more oriented toward problem solving and systems engineering.

All 4GLs are designed to reduce programming effort, the time it takes to develop software, and the cost of software development. They are not always successful in this task, sometimes resulting in inelegant and unmaintainable code. However, given the right problem, the use of an appropriate 4GL can be spectacularly successful as was seen with MARK-IV and MAPPER (see History Section, Santa Fe real-time tracking of their freight cars – the productivity gains were estimated to be 8 times over COBOL). The usability improvements obtained by some 4GLs (and their environment) allowed better exploration for heuristic solutions than did the 3GL.

A quantitative definition of 4GL has been set by Capers Jones, as part of his work on function point analysis. Jones defines the various generations of programming languages in terms of developer productivity, measured in function points per staff-month. A 4GL is defined as a language that supports 12–20 FP/SM. This correlates with about 16–27 lines of code per function point implemented in a 4GL.

Fourth-generation languages have often been compared to domain-specific programming languages (DSLs). Some researchers state that 4GLs are a subset of DSLs. Given the persistence of assembly language even now in advanced development environments, one expects that a system ought to be a mixture of all the generations, with only very limited use of the first.

History

Though used earlier in papers and discussions, the term 4GL was first used formally by James Martin in his 1982 book Applications Development Without Programmers to refer to non-procedural, high-level specification languages. In some primitive way, IBM's RPG (1960) could be described as the first 4GL followed closely by others, such as the Informatics MARK-IV (1967) product and Sperry's MAPPER (1969 internal use, 1979 release).

The motivations for the '4GL' inception and continued interest are several. The term can apply to a large set of software products. It can also apply to an approach that looks for greater semantic properties and implementation power. Just as the 3GL offered greater power to the programmer, so too did the 4GL open up the development environment to a wider population.

In a sense, the 4GL is an example of 'black box' processing: each generation is further from the machine. It is this distance from the machine that is directly associated with 4GL errors being harder, in many cases, to debug. In terms of applications, a 4GL could be business oriented or it could deal with some technical domain. Being further from the machine implies being closer to the domain. Given the wide disparity of concepts and methods across domains, 4GL limitations led to recognition of the need for the 5GL.

The early input scheme for the 4GL supported entry of data within the 72-character limit (8 bytes used for sequencing) of the punched card where a card's tag would identify the type or function. With judicious use of a few cards, the 4GL deck could offer a wide variety of processing and reporting capability whereas the equivalent functionality coded in a 3GL could subsume, perhaps, a whole box or more of cards.
The 72-character metaphor continued for a while as hardware progressed to larger memory and terminal interfaces. Even with its limitations, this approach supported highly sophisticated applications.

As interfaces improved and allowed longer statement lengths and grammar-driven input handling, greater power ensued. An example of this is described on the Nomad page.

Another example of Nomad's power is illustrated by Nicholas Rawlings in his comments for the Computer History Museum about NCSS. He reports that James Martin asked Rawlings for a Nomad solution to a standard problem Martin called the Engineer's Problem: "give 6% raises to engineers whose job ratings had an average of 7 or better." Martin provided a "dozen pages of COBOL, and then just a page or two of Mark IV, from Informatics." Rawlings offered the following single statement, performing a set-at-a-time operation ...

The 4GL evolution was influenced by several factors, with the hardware and operating system constraints having a large weight. When the 4GL was first introduced, a disparate mix of hardware and operating systems mandated custom application development support that was specific to the system in order to ensure sales. One example is the MAPPER system developed by Sperry. Though it has roots back to the beginning, the system has proven successful in many applications and has been ported to modern platforms. The latest variant is embedded in the BIS offering of Unisys. MARK-IV is now known as VISION:BUILDER and is offered by Computer Associates.

Santa Fe railroad used MAPPER to develop a system, in a project that was an early example of 4GL, rapid prototyping, and programming by users. The idea was that it was easier to teach railroad experts to use MAPPER than to teach programmers the "intricacies of railroad operations".

One of the early (and portable) languages that had 4GL properties was Ramis developed by Gerald C. Cohen at Mathematica, a mathematical software company. Cohen left Mathematica and founded Information Builders to create a similar reporting-oriented 4GL, called Focus.

Later 4GL types are tied to a database system and are far different from the earlier types in their use of techniques and resources that have resulted from the general improvement of computing with time.

An interesting twist to the 4GL scene is the realization that graphical interfaces and the related reasoning done by the user form a 'language' that is poorly understood.

FOURTH GENERATION MICROPROCESSORS

After the integrated circuit, the only place to go was down - in size, that is. Large scale integration (LSI) could fit hundreds of components onto one chip. By the 1980s, very large scale integration (VLSI) squeezed hundreds of thousands of components onto a chip. The ability to fit so much onto an area about half the size of a U.S. dime helped diminish the size and price of computers. It also increased their power, efficiency and reliability. Marcian Hoff invented the microprocessor, a device which could replace several of the components of earlier computers. The microprocessor is the defining characteristic of fourth generation computers, capable of performing all of the functions of a computer's central processing unit.

The reduced size, reduced cost, and increased speed of the microprocessor led to the creation of the first personal computers. Until then, computers had been almost exclusively the domain of universities, business and government. In 1976, Steve Jobs and Steve Wozniak built the Apple I, an early personal computer, in a garage in California; the Apple II followed in 1977. Then, in 1981, IBM introduced its first personal computer. The personal computer was such a revolutionary concept and was expected to have such an impact on society that in 1982, "Time" magazine dedicated its annual "Man of the Year Issue" to the computer.

The other feature of the microprocessor is its versatility. Whereas previously the integrated circuit had had to be manufactured to fit a special purpose, now one microprocessor could be manufactured and then programmed to meet any number of demands. Soon everyday household items such as microwave ovens, television sets and automobiles with electronic fuel injection incorporated microprocessors.

The 1980s saw an expansion in computer use in all three arenas as clones of the IBM PC made the personal computer even more affordable. The number of personal computers in use more than doubled from 2 million in 1981 to 5.5 million in 1982. Ten years later, 65 million PCs were being used. Computers continued their trend toward a smaller size, working their way down from desktop to laptop computers (which could fit inside a briefcase) to palmtop (able to fit inside a breast pocket).



[Figure: fourth generation integrated circuits]

[Figure: close-up of Control Data Intebrid fourth generation ICs]


Intel 8086 CPU


The Intel 8086 was based on the design of the Intel 8080 and Intel 8085, with a similar register set, but was expanded to 16 bits. It featured four 16-bit general registers, which could also be accessed as eight 8-bit registers, and four 16-bit index registers (including the stack pointer). The segment registers allowed the CPU to access 1 MB of memory. The 8086 was not considered a masterpiece; many of its features were extremely obtuse.
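
How 16-bit registers reach 1 MB can be shown with a short Python sketch. The shift-by-four rule is a detail of the 8086 that the text above does not spell out, and the function names are illustrative.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address.

    The segment is shifted left by 4 bits (multiplied by 16) and added to
    the offset; the sum wraps at 20 bits because the 8086 has 20 address lines.
    """
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF


def split_register(value16):
    """Split a 16-bit general register into its high and low 8-bit halves."""
    return (value16 >> 8) & 0xFF, value16 & 0xFF


address_space = 1 << 20                        # 2**20 bytes, i.e. 1 MB
example = physical_address(0x1234, 0x0010)     # 0x12340 + 0x0010
wrapped = physical_address(0xFFFF, 0xFFFF)     # exceeds 20 bits, so it wraps
high_byte, low_byte = split_register(0x12AB)   # e.g. AX seen as AH and AL
```

Many different segment and offset pairs map to the same physical address, which is one of the quirks that made the design awkward to program.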

Intel Pentium CPU


Intel's superscalar successor to the 486 was introduced on March 22, 1993. It has two 32-bit 486-type integer pipelines with dependency checking. It can execute a maximum of two instructions per cycle. It does pipelined floating-point and performs branch prediction. It has 16 kilobytes of on-chip cache, a 64-bit memory interface, 8 32-bit general-purpose registers and 8 80-bit floating-point registers. It is built from 3.3 million transistors on a 262.4 square mm die with ~2.3 million transistors in the core logic. Its clock rate is 66 MHz and its heat dissipation is 16 W. In burst mode, the Pentium loads 256 bits of data into its 16K on-board cache in one clock cycle. It is called "Pentium" because it is the fifth in the 80x86 line. It would have been called the 80586 had a US court not ruled that you can't trademark a number. The successors are the Pentium Pro and Pentium II. A floating-point division bug was discovered in October 1994.