TargetWoman - Information Portal for Women

2006 September | Women Blog

Women Blog - Behind the scene information about running a leading women portal - from setting up the server to maximizing the visibility amongst the discerning decision making women.
 

Latent Semantic Analysis

Filed under: Managing a Portal — admin @ 8:25 am

Lately, people who follow search engines closely have been using a buzzword: LSI, or Latent Semantic Indexing. For example, you can query Google for the phrase women portal like this: ~women portal

http://www.google.com/search?num=100&hl=en&lr=&q=%7Ewomen+portal

The tilde (~) before the search phrase indicates that you want latent semantic analysis turned on. Put simply, Google will then look for logical extensions of the supplied phrase, such as related words, and not just the words you typed. In reality, it is a lot more complicated than this.
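For the curious, the %7E in the address above is nothing more than the tilde in percent-encoded form. Here is a small Python sketch (not taken from our own code; the function name build_search_url is made up for illustration) that builds such an address and decodes the query back out of it:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(phrase, results=100):
    """Build a search address whose query starts with a tilde.

    Hypothetical helper for illustration: the tilde is simply part of
    the query text, so it is prepended before encoding.
    """
    params = {"num": results, "hl": "en", "q": "~" + phrase}
    return "http://www.google.com/search?" + urlencode(params)


def extract_query(url):
    """Return the decoded value of the q parameter of a search address."""
    return parse_qs(urlsplit(url).query)["q"][0]


# The address quoted in the post, with the tilde written as %7E.
original = "http://www.google.com/search?num=100&hl=en&lr=&q=%7Ewomen+portal"

print(extract_query(original))                          # ~women portal
print(extract_query(build_search_url("women portal")))  # ~women portal
```

Whether the tilde is sent as ~ or as %7E depends on the tool doing the encoding; both decode to the same query.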

One thing I like about the Internet is the way it offers a level playing field for everyone, whether you are a 900 lb gorilla or a timid 6 lb Chihuahua. We have been tinkering with LSA for some time now, about a couple of years. Our agenda is much simpler in nature: to deliver the right page from our thousands of pages of content for a given search phrase. You may have noticed, on our main page and elsewhere, a search box with some mumbo jumbo about natural language navigation.

To tell you the truth without too much technical detail or hype, LSA (Latent Semantic Analysis) is a simple behind-the-scenes process in which a computer program figures out the concept behind a word or phrase and identifies the matching content. Most writers use different words or phrases to describe the same idea or concept. Even the most painstaking editing before publication will not weed out each writer's individual bias in language, where the same words mean different things to different people. Editors can enforce a consistent style and language across the entire website, but they can do little to bring homogeneity to the choice of words.

For those who are technically inclined, this is called synonymy: many words exist to describe a single idea. In contrast, polysemy describes a word or phrase with multiple meanings, which is again a problem for our search engine approach. Some of the words people use to search might end up fetching the wrong page.

Human languages are probably among the most complicated things computers are asked to handle. Subtle nuances of language are not easy to quantify in objective terms. People often intuitively “arrive” at the intended meaning of the written word from the position of the words in relation to each other. Cues such as modulation of the voice and emphasis placed on syllables, which exist in spoken language, do not exist on a written page.

The only cues left to analyze are the relative positions of the words and their frequency of occurrence in each article page. Most of our pages contain thousands of occurrences of common words, which receive less weight than the unique primary keyword phrases. These ‘weighted’ phrases are then factored into our search engine, along with their synonyms, for classification.
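To give a feel for the weighting idea, here is a minimal Python sketch. It uses the textbook TF-IDF formula, which is an assumption on our part for illustration and not a description of our actual engine; the page names and texts are invented, and word position is ignored entirely.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(pages):
    """Weight every term on every page.

    pages maps a page name to its text. A term that appears on every
    page gets weight 0; a term unique to one page gets the most weight.
    """
    counts = {name: Counter(tokenize(text)) for name, text in pages.items()}

    # Number of pages on which each term occurs at least once.
    doc_freq = Counter()
    for page_counts in counts.values():
        doc_freq.update(page_counts.keys())

    total_pages = len(pages)
    weights = {}
    for name, page_counts in counts.items():
        total_terms = sum(page_counts.values())
        weights[name] = {
            term: (count / total_terms) * math.log(total_pages / doc_freq[term])
            for term, count in page_counts.items()
        }
    return weights


# Invented sample pages, for illustration only.
PAGES = {
    "yoga": "the benefits of yoga for the body and the mind",
    "diet": "the diet plan and the food for the body",
    "career": "the career plan for the working woman",
}

weights = tf_idf(PAGES)
for term in ("the", "body", "yoga"):
    print(term, round(weights["yoga"][term], 3))
```

In this toy data, "the" occurs on all three pages and so carries no weight, while "yoga" occurs on only one page and outweighs "body", which occurs on two.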

To cut a long story short, we decided to use an extensive dictionary of English words to help our version of LSA. Sometimes it is really thrilling to see our internal search engine deliver the most appropriate article for a search phrase with relative ease. On the other hand, it can just as easily be stumped by a contrived phrase, though this happens relatively rarely.
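The dictionary idea can be sketched in a few lines of Python. Everything here is hypothetical: the tiny synonym table, the page texts and the function names are invented for illustration, and a real dictionary would be far larger. The sketch maps each word to a single "concept" word and then picks the page that shares the most concepts with the query.

```python
import re

# Hypothetical miniature dictionary: concept word -> its synonyms.
SYNONYMS = {
    "woman": {"women", "lady", "ladies", "female"},
    "portal": {"site", "website", "gateway"},
}


def build_lookup(groups):
    """Map every synonym (and the concept word itself) to the concept word."""
    lookup = {}
    for concept, words in groups.items():
        for word in words | {concept}:
            lookup[word] = concept
    return lookup


LOOKUP = build_lookup(SYNONYMS)


def concepts(text):
    """Return the set of concepts in a text; unknown words stand for themselves."""
    return {LOOKUP.get(word, word) for word in re.findall(r"[a-z]+", text.lower())}


def best_page(query, pages):
    """Pick the page sharing the most concepts with the query."""
    wanted = concepts(query)
    scores = {name: len(wanted & concepts(text)) for name, text in pages.items()}
    return max(scores, key=scores.get)


# Invented sample pages, for illustration only.
PAGES = {
    "health": "Health tips every lady should know",
    "about": "A gateway for female readers",
    "recipes": "Quick dinner recipes",
}

print(best_page("women portal", PAGES))  # about
```

Note that the "about" page wins without containing either of the words typed. This handles synonymy only; a word with several meanings would still be mapped to a single concept, which is exactly where a contrived phrase can trip things up.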

The technology to negotiate the vast realms of human language is still nascent, and our LSA is still in beta.

To go back to the first example, where we used Google to look for words semantically related to the phrase women portal: you should see many occurrences of lady, woman, female and so on in the search results page. It is also in beta …

 

Setting up X server in Windows

Filed under: Managing Servers — admin @ 1:36 am

The administrator of a leading women portal has to don various roles at the same time. Researching, designing, writing, reviewing and editing technical documentation is just not enough.
Amongst other things, you have to demonstrate a strong understanding of professional web application concepts and techniques, from servers and application integration to designing and managing content.

Managing servers, fine-tuning them for peak performance and writing server control directives to get the maximum out of a server are almost routine in this line of work.
For this, you must have a development server that closely matches the real-world production server. We intend to share our experiences in this field, not as a means to trumpet our prowess but to show that we have much humbler origins.

In this series, we start with setting up an X server on a Windows machine to work with a Linux development server. It was suggested that an easier way must be found to remotely manage several Linux machines from, of all things, a Windows machine. It takes quite some time to walk across to the individual cubicles to check why some files are not accessible to a select team whilst the rest of the team happily works with the common file server.
Of course, most things can be managed with just a command line interface through SSH, but not applications that need a GUI window manager.
So we need an X server set up that can connect to any pre-configured Linux desktop with little fuss. It would be a no-brainer if you were going to use another Linux desktop. But here, it was decided that the webmaster’s Windows machine would have to act as the interface.
In computer parlance, the X Window System, commonly referred to as X11 or simply X, provides windowing for bitmap displays. X, as implemented on Unix-like and Linux machines, is based on client-server technology. The communication between the server and the client operates transparently over the network. In other words, the client and server may run on the same machine or be separated by miles and connected securely through the Internet by tunneling the connection.
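One detail not spelled out above: an X client finds its server through an address of the form host:display.screen, usually held in the DISPLAY environment variable, and display number N conventionally corresponds to TCP port 6000 + N. The Python sketch below takes such an address apart. It is a simplified illustration (the function name parse_display is ours, and IPv6 and other address forms are deliberately ignored).

```python
import re

X11_BASE_PORT = 6000  # display N conventionally listens on TCP port 6000 + N


def parse_display(value):
    """Split an X display address such as 'localhost:10.0' into its parts.

    Simplified sketch: only the plain host:display[.screen] form is
    handled. An empty host means the local machine, in which case the
    TCP port is nominal because a local socket is normally used.
    """
    match = re.fullmatch(r"([^:]*):(\d+)(?:\.(\d+))?", value)
    if match is None:
        raise ValueError(f"not a display address: {value!r}")
    host, display, screen = match.groups()
    display = int(display)
    return {
        "host": host or None,
        "display": display,
        "screen": int(screen or 0),
        "tcp_port": X11_BASE_PORT + display,
    }


print(parse_display(":0"))
print(parse_display("localhost:10.0"))
```

When the connection is tunneled through SSH, the remote side typically sees an address like localhost:10.0, meaning the client talks to a local port that SSH forwards back to the real X server.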
We evaluated all options before arriving at a solution that would be easy to implement and manage. In our evaluation we decided to abide by the guiding principles of X as propounded by Bob Scheifler and Jim Gettys:

It is as important to decide what a system is not as to decide what it is. Do not serve all the world’s needs; rather, make the system extensible so that additional needs can be met in an upwardly compatible fashion.
The only thing worse than generalizing from one example is generalizing from no examples at all.
If a problem is not completely understood, it is probably best to provide no solution at all.

We decided to go with Cygwin (http://x.cygwin.com/) for this task. So we downloaded setup.exe and installed the X Window System on the target Windows machine. It is simple and requires no hand-holding to install. We will elaborate on implementing an SSH server using Cygwin in a later blog post.
Suffice it to say that once the installation is done, you should be ready to set up connectivity to the Linux desktops.
You will find a batch file to start the X server at G:\cygwin\usr\X11R6\bin\startxwin.bat, or the equivalent path under wherever you have installed Cygwin.
Fire up the batch file and you should be presented with a white screen with a command prompt. You connect to the target Linux desktop like so:
ssh -Y -l root 192.168.0.10
The above command uses SSH to set up a tunnel for X and logs in as root on the destination Linux desktop at 192.168.0.10.
Supply the credentials as required and you will be logged in to the remote Linux desktop.
Fire up the required application from the command prompt and you are done.
For example, if you want to see the desktop, type nautilus and you will see the remote desktop on your Windows machine, assuming that the remote desktop runs Red Hat 9 with Nautilus.
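If you have several Linux machines to look after, the connection step can be scripted. The Python sketch below only assembles and prints the command; the function name x11_ssh_command is made up for illustration. It assumes an ssh client on the path, which can also be given the program to start as a trailing argument.

```python
import shlex


def x11_ssh_command(host, user, remote_program=None):
    """Build the argument list for an SSH login with X forwarding.

    -Y asks ssh for trusted X11 forwarding and -l names the login user.
    If remote_program is given, ssh runs it instead of an interactive shell.
    """
    argv = ["ssh", "-Y", "-l", user, host]
    if remote_program:
        argv.append(remote_program)
    return argv


command = x11_ssh_command("192.168.0.10", "root", "nautilus")
print(" ".join(shlex.quote(part) for part in command))
# ssh -Y -l root 192.168.0.10 nautilus
```

Passing the list to subprocess.run(command) from within the Cygwin X session would actually open the connection; it is left out here so the sketch has no side effects.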
Happy remote computing …



© Copyright 2004-2009 Targetwoman All rights reserved.


