lexical analysis

Post by metulburr » Thu Aug 22, 2013 3:46 pm

My next project appears to be creating my own programming language. The part I get confused about is lexical analysis. How exactly would you parse a file written in your so-called language into tokens, and then convert those into an existing language to execute them? That is the part I am not sure about.

For example, let's assume I considered this valid (oh, I don't know) Metulburrian syntax, lol:
Code:
say("hello world")

#test():
    say('test')
 
test()


The same thing translated to Python would be:
Code:
print("hello world")
def test():
    print("test")

test()


Would the easiest way be to convert it to Python and then run it via Python? In other words, convert # to def and the function say to print, use indentation to signify blocks, etc. Or should I just skip Python and convert it to a binary? But then how would you execute it?
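For the tokenizing step asked about above, here is a minimal sketch of a regex-based lexer for the toy syntax in this post. The token names (STRING, NAME, HASH, and so on) and the function tokenize_line are made up for illustration. It also ignores leading indentation, which a real lexer for an indentation-based language would have to turn into tokens of its own.

```python
import re

# Hypothetical token set for the toy syntax above; the names are illustrative only.
TOKEN_SPEC = [
    ("STRING", r'"[^"\n]*"|\'[^\'\n]*\''),  # double- or single-quoted, no escapes
    ("NAME",   r"[A-Za-z_]\w*"),
    ("HASH",   r"#"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("COLON",  r":"),
    ("SKIP",   r"[ \t]+"),                  # whitespace is matched but not emitted
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize_line(line):
    """Return a list of (kind, text) pairs for one line of source."""
    tokens = []
    pos = 0
    while pos < len(line):
        match = TOKEN_RE.match(line, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {line[pos]!r} at column {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for line in ['say("hello world")', "#test():", "    say('test')", "test()"]:
    print(tokenize_line(line))
```

A parser would then consume these (kind, text) pairs instead of raw characters, which is what makes the later translation or execution step manageable.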

Re: lexical analysis

Post by micseydel » Thu Aug 22, 2013 4:22 pm

Early C++ compilers (Cfront, for example) originally converted the code to C and relied on existing C compilers. If you want a binary, you don't want to convert your code to Python, but you could convert it to C or C++ (or you could generate Java, or Java bytecode, etc.). If you want a binary, you'll probably want at least type annotations or something similar as a stepping stone to dynamic typing, rather than pure dynamic typing.

If you want to see lexical analysis in action, I'd suggest looking at Python's C source or the PyPy source code. It would be nice since you'd already know the language they're working on.
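A quicker way to watch Python's lexical analysis at work, before diving into the C source, may be the standard library's tokenize module. The sketch below is only an illustration: the helper name python_tokens is made up here, and the exact set of tokens reported can differ slightly between Python versions.

```python
import io
import token
import tokenize


def python_tokens(source):
    """Return (token type name, token text) pairs for a string of Python source."""
    # generate_tokens() expects a readline-style callable that returns str lines.
    readline = io.StringIO(source).readline
    return [(token.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)]


for kind, text in python_tokens('print("hello world")\n'):
    print(kind, repr(text))
```

Note that operators such as the parentheses are all reported with the generic type OP; the tok.exact_type attribute distinguishes them if that matters.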

Re: lexical analysis

Post by metulburr » Thu Aug 22, 2013 9:17 pm

So you're pretty much saying convert it to C++, and run it from there?
Code:
#include <iostream>

void test(){
    std::cout << "test" << std::endl;
}

int main(){
    std::cout << "hello world" << std::endl;
    test();
}

Re: lexical analysis

Post by micseydel » Thu Aug 22, 2013 9:23 pm

I didn't mean that's the only solution, but it's certainly a reasonable solution. Much easier and more portable than producing assembly, but still gets you your binary.

Re: lexical analysis

Post by jkbbwr » Fri Aug 23, 2013 8:43 am

I have taken a great interest in this over the years. This is one method, known as targeted translation: a syntax that compiles into another output form.
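As a rough illustration of that kind of translation, the sketch below rewrites the toy syntax from the first post into Python source and then runs the result. The names translate, DEF_RE and SAY_RE are invented for this example. It works on raw text with regular expressions, so it would also rewrite say( inside a string literal; a real translator would work on tokens instead.

```python
import re

# Hypothetical rules for the toy syntax from the first post.
DEF_RE = re.compile(r"^(\s*)#(\w+)\((.*)\):\s*$")   # "#name(args):" header
SAY_RE = re.compile(r"\bsay\(")                      # calls to say(...)


def translate(source):
    """Translate toy-language source text into Python source text."""
    out = []
    for line in source.splitlines():
        match = DEF_RE.match(line)
        if match:
            indent, name, params = match.groups()
            out.append(f"{indent}def {name}({params}):")
        else:
            # Indentation is left untouched, so blocks carry over unchanged.
            out.append(SAY_RE.sub("print(", line))
    return "\n".join(out) + "\n"


toy = '''say("hello world")

#test():
    say('test')

test()
'''

python_source = translate(toy)
print(python_source)

# Run the generated Python in a fresh namespace.
exec(compile(python_source, "<toy>", "exec"), {})
```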

Another way is to write your own compiler that emits machine code, or to write an interpreter, bootstrap it with a compiled language, and have it execute your source.
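To make the interpreter option a bit more concrete, here is a very small sketch that executes the toy syntax from the first post directly, without generating any other language. It is written in Python only to keep the thread in one language. The function run and its rules are assumptions for illustration: it handles only say("text"), zero-argument function definitions, and zero-argument calls.

```python
import re

CALL_RE = re.compile(r"^(\w+)\((.*)\)$")   # name(...) statement
DEF_RE = re.compile(r"^#(\w+)\(\):$")      # "#name():" header, no parameters


def run(source, output=print):
    """Execute toy source directly. No arguments, nesting, or expressions."""
    functions = {}    # function name -> list of body statements
    current = None    # function whose body is currently being collected

    def execute(statement):
        match = CALL_RE.match(statement)
        if match is None:
            raise SyntaxError(f"cannot parse {statement!r}")
        name, arg = match.groups()
        if name == "say":
            output(arg[1:-1])              # assumes a quoted string; strips the quotes
        elif name in functions:
            for body_statement in functions[name]:
                execute(body_statement)
        else:
            raise NameError(f"unknown function {name!r}")

    for line in source.splitlines():
        if not line.strip():
            continue                       # skip blank lines
        indented = line[0] in " \t"
        text = line.strip()
        header = DEF_RE.match(text)
        if header and not indented:
            current = header.group(1)
            functions[current] = []
        elif indented and current is not None:
            functions[current].append(text)   # store the body, run it on call
        else:
            current = None
            execute(text)


toy = '''say("hello world")

#test():
    say('test')

test()
'''

run(toy)
```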

Re: lexical analysis

Post by Somelauw » Fri Aug 23, 2013 11:25 am

I think I would use an intermediate language instead of converting directly to bytecode or machine code, because then you can at least understand the output.
I'm not sure which intermediate language is best.
C++ has many features on the one hand, but its syntax isn't straightforward and it compiles slowly.
Lisp might be easiest as an intermediate language because it already looks somewhat like an abstract syntax tree and it's a dynamic language.

Re: lexical analysis

Post by jkbbwr » Fri Aug 23, 2013 12:29 pm

As far as intermediate languages are concerned, you don't normally target another whole language. You normally target a language-agnostic executable format, for example JVM instructions for the Java Virtual Machine, Python bytecode, or IL for the .NET framework. IMO these are readable and still great choices to target.
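To get a feel for how readable Python bytecode is, the standard library can compile a snippet and print its instructions. This sketch only inspects the bytecode that CPython itself produces for the example from the first post; it does not emit bytecode from a custom language, and the opcode names in the listing vary between Python versions.

```python
import dis

source = 'print("hello world")\n'

# compile() returns a code object; its co_code attribute holds the raw bytecode.
code = compile(source, "<example>", "exec")

dis.dis(code)    # human-readable instruction listing
exec(code)       # run the compiled code object
```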